Dataset Generation for AP-axis analysis

What are the genetic and regulatory mechanisms that drive the intense proliferation from the spinal-cord to the forebrain in bilateria?

We know that the silencing of the Hox genes in the forebrain facilitates the growth of the tissue, however, do the Hox genes act in isolation? Or are there other genes that exhibit the same phenotype and thus, may be equally integral to this process?

By removing the silencing mechanism (H3K27me3) we investigate which genes, act in a similar manner by integrating the RNAseq response to the knockout with epigenetic data.

Cells appendix:

1) Imports
2) RNAseq merging files
3) Add annotation information
4) Save dataframe combinations to csv files for DE analysis
5) Run R scripts a. DeSEQ2 & b. Normalisation (in R) plot results from normalisation
6) Plot results from DEseq2
7) Make bar chart of all comparisons (i.e. how many DE genes in each comaprison)
8) Select log2(TMM + 1) as normalisation and build merged dataframe (raw + DEseq2 results)
9) Visualise the merged data top most significant genes (Heatmap and PCA) 
10) Add histone modification data from Encode (downloaded & processed in APaxis_encodeDataDownload)
11) Filter the merged dataframe (i.e. for significant results)

1) Imports

In [1]:
"""
--------------------------------------------------------
Import RNAseq data & merge featureCounts files
--------------------------------------------------------
"""

import os, sys
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

import venn
from matplotlib_venn import venn3

from sciviso import *
from sciutil import SciUtil

u = SciUtil()

module_path = os.path.abspath(os.path.join('..'))
sys.path.append(module_path)


"""
--------------------------------------------------------
                     Global variables
--------------------------------------------------------
"""
date = '20210124'

gene_id = 'entrezgene_id'
gene_name = 'external_gene_name'

sci_colour = ['#483873', '#1BD8A6', '#B117B7', '#AAC7E2', '#FFC107', '#016957', '#9785C0', 
             '#D09139', '#338A03', '#FF69A1', '#5930B1', '#FFE884', '#35B567', '#1E88E5', 
             '#ACAD60', '#A2FFB4', '#B618F5', '#854A9C']

hist_colour = '#483873'

e11_colour = 'white'
e13_colour = 'lightgrey'
e15_colour = '#A7A9AC'
e18_colour = '#58595B'

fb_colour = '#ffbf80'
mb_colour = '#ff8c1a'
hb_colour = '#b35900'
sc_colour = '#663300'
grey = '#bdbdbd'

h3k36me3_colour = '#CEF471'
h3k27me3_colour = '#9F71F4'
h3k4me3_colour = '#9F00FA'
h3k4me2_colour = '#5930B1'
h3k4me1_colour = '#FFE884'
h3k27ac_colour = '#35B567'
h3k9me3_colour = '#1E88E5'
h3k9ac_colour = '#A2FFB4'
           
wt_colour = '#AADFF1'
ko_colour = '#A53736'

sns.palplot(sci_colour)
sns.color_palette(sci_colour)

project_name = 'prelim'

data_dir = '../data/'
r_dir = f'{data_dir}results/deseq2/'
fig_dir = f'../figures/{project_name}/'
output_dir = f'{data_dir}results/{project_name}/'
input_dir = f'{data_dir}input/'
rna_dir = f'{input_dir}feature_counts/'

cmap = 'RdBu_r'
u_id = 'u_id'
sns.palplot(sci_colour)
sns.color_palette(sci_colour)

def get_time_colour(c):
    if '11' in c or '10' in c:
        return e11_colour
    elif '13' in c or '12' in c:
        return e13_colour
    elif '15' in c or '14' in c:
        return e15_colour
    elif '18' in c or '16' in c:
        return e18_colour
    return '#FFFFFF'

def get_tissue_colour(c):
    if 'sc' in c or 'spinal' in c:
        return sc_colour
    elif 'hb' in c or 'hindbrain' in c:
        return hb_colour
    elif 'mb' in c or 'midbrain' in c:
        return mb_colour
    elif 'fb' in c or 'forebrain' in c:
        return fb_colour
    return '#FFFFFF'

def get_cond_colour(c):
    if 'ko' in c:
        return ko_colour
    elif 'wt' in c:
        return wt_colour
    return '#FFFFFF'

def get_mark_colour(c):
    if '36me3' in c:
        return h3k36me3_colour
    elif '27me3' in c:
        return h3k27me3_colour
    elif 'K4me3' in c:
        return h3k4me3_colour
    elif 'K4me2' in c:
        return h3k4me2_colour
    elif 'K4me1' in c:
        return h3k4me1_colour
    elif '27ac' in c:
        return h3k27ac_colour
    elif 'K9me3' in c:
        return h3k9me3_colour
    elif 'K9ac' in c:
        return h3k9ac_colour
    return '#FFFFFF'

def pplot():
    plt.rcParams['figure.figsize'] = [5, 4]
    plt.rcParams['image.cmap'] = cmap
    plt.rcParams['svg.fonttype'] = 'none'
        
def cplot():
    plt.rcParams['figure.figsize'] = [6, 4]
    sns.color_palette(sci_colour)
    
def save_fig(title, ending='.svg'):
    plt.savefig(f'{fig_dir}{title.replace(" ", "-")}{ending}')
    

2) RNAseq merging files

Here we import the results from our FeatureCounts data. We merge the files together on their entrez ID.

In [2]:
files = os.listdir(rna_dir)

rna_files = []
for f in files:
    if 'summary' not in f:
        rna_files.append(f)

# Make a df out of all expression data
rna_files.sort()
df = pd.DataFrame()
        
# Basically want to create a dummy df since we don't want the columns except the expression
# and the entrez gene id
tmp_df = pd.read_csv(f'{rna_dir}{rna_files[0]}', sep='\t', header=1)

df['Geneid'] = tmp_df['Geneid'].values
df[rna_files[0][:-4]] = tmp_df[tmp_df.columns[-1]].values
# Now we'll add all the other RNA files to this dataframe by joining on ID
for filename in rna_files[1:]:
    u.dp(["Adding", filename])
    tmp_df = pd.DataFrame()
    file_df = pd.read_csv(f'{rna_dir}{filename}', sep='\t', header=1)
    print(file_df.head())
    u.dp([f'Length of {filename}: ', len(file_df)])
    tmp_df['Geneid'] = file_df['Geneid'].values
    # Note we take the log transform here 
    tmp_df[filename[:-4]] = file_df[file_df.columns[-1]].values
    df = df.merge(tmp_df, on='Geneid', how='outer')

# Save DF to csv for normalisation 
u.dp(["Length of merged RNAseq dataframe: ", len(df)])
--------------------------------------------------------------------------------
                              Adding	ko11fb2.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/ko11fb2_sorted.bam  
0                                                 96      
1                                                  4      
2                                                  0      
3                                                  5      
4                                                359      
--------------------------------------------------------------------------------
                         Length of ko11fb2.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	ko11hb1.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/ko11hb1_sorted.bam  
0                                                101      
1                                                  3      
2                                                  0      
3                                                  3      
4                                                379      
--------------------------------------------------------------------------------
                         Length of ko11hb1.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	ko11hb2.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/ko11hb2_sorted.bam  
0                                                157      
1                                                 11      
2                                                  0      
3                                                 11      
4                                                282      
--------------------------------------------------------------------------------
                         Length of ko11hb2.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	ko11mb1.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/ko11mb1_sorted.bam  
0                                                177      
1                                                 10      
2                                                  0      
3                                                 17      
4                                                365      
--------------------------------------------------------------------------------
                         Length of ko11mb1.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	ko11mb2.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/ko11mb2_sorted.bam  
0                                                136      
1                                                 16      
2                                                  0      
3                                                  6      
4                                                349      
--------------------------------------------------------------------------------
                         Length of ko11mb2.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	ko11sc1.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/ko11sc1_sorted.bam  
0                                                166      
1                                                 12      
2                                                  0      
3                                                  8      
4                                                362      
--------------------------------------------------------------------------------
                         Length of ko11sc1.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	ko11sc2.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/ko11sc2_sorted.bam  
0                                                177      
1                                                  6      
2                                                  0      
3                                                 24      
4                                                398      
--------------------------------------------------------------------------------
                         Length of ko11sc2.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	ko13fb1.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/ko13fb1_sorted.bam  
0                                                599      
1                                                  7      
2                                                  0      
3                                                  2      
4                                                384      
--------------------------------------------------------------------------------
                         Length of ko13fb1.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	ko13fb2.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/ko13fb2_sorted.bam  
0                                                670      
1                                                  5      
2                                                  1      
3                                                  5      
4                                                355      
--------------------------------------------------------------------------------
                         Length of ko13fb2.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	ko13hb1.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/ko13hb1_sorted.bam  
0                                               1076      
1                                                  2      
2                                                  2      
3                                                 15      
4                                                445      
--------------------------------------------------------------------------------
                         Length of ko13hb1.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	ko13hb2.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/ko13hb2_sorted.bam  
0                                                923      
1                                                  6      
2                                                  0      
3                                                 11      
4                                                437      
--------------------------------------------------------------------------------
                         Length of ko13hb2.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	ko13mb1.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/ko13mb1_sorted.bam  
0                                                773      
1                                                  7      
2                                                  0      
3                                                  9      
4                                                388      
--------------------------------------------------------------------------------
                         Length of ko13mb1.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	ko13mb2.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/ko13mb2_sorted.bam  
0                                                665      
1                                                  3      
2                                                  0      
3                                                  5      
4                                                378      
--------------------------------------------------------------------------------
                         Length of ko13mb2.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	ko13sc1.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/ko13sc1_sorted.bam  
0                                                929      
1                                                  0      
2                                                  2      
3                                                 10      
4                                                367      
--------------------------------------------------------------------------------
                         Length of ko13sc1.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	ko13sc2.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/ko13sc2_sorted.bam  
0                                                743      
1                                                  3      
2                                                  0      
3                                                  8      
4                                                391      
--------------------------------------------------------------------------------
                         Length of ko13sc2.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	ko15fb1.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/ko15fb1_sorted.bam  
0                                               1125      
1                                                 64      
2                                                  0      
3                                                 10      
4                                                416      
--------------------------------------------------------------------------------
                         Length of ko15fb1.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	ko15fb2.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/ko15fb2_sorted.bam  
0                                               1037      
1                                                129      
2                                                  6      
3                                                  5      
4                                                492      
--------------------------------------------------------------------------------
                         Length of ko15fb2.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	ko15hb1.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/ko15hb1_sorted.bam  
0                                               1437      
1                                                 77      
2                                                  7      
3                                                 10      
4                                                342      
--------------------------------------------------------------------------------
                         Length of ko15hb1.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	ko15hb2.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/ko15hb2_sorted.bam  
0                                               1294      
1                                                316      
2                                                  9      
3                                                 13      
4                                                676      
--------------------------------------------------------------------------------
                         Length of ko15hb2.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	ko15mb1.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/ko15mb1_sorted.bam  
0                                               1311      
1                                                 99      
2                                                  6      
3                                                  6      
4                                                441      
--------------------------------------------------------------------------------
                         Length of ko15mb1.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	ko15mb2.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/ko15mb2_sorted.bam  
0                                                677      
1                                                296      
2                                                  8      
3                                                  5      
4                                                356      
--------------------------------------------------------------------------------
                         Length of ko15mb2.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	ko15sc1.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/ko15sc1_sorted.bam  
0                                                918      
1                                                 98      
2                                                 10      
3                                                 12      
4                                                415      
--------------------------------------------------------------------------------
                         Length of ko15sc1.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	ko15sc2.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/ko15sc2_sorted.bam  
0                                               1141      
1                                                227      
2                                                  5      
3                                                 15      
4                                                597      
--------------------------------------------------------------------------------
                         Length of ko15sc2.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	ko18fb1.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/ko18fb1_sorted.bam  
0                                               1163      
1                                                 39      
2                                                  7      
3                                                  8      
4                                                415      
--------------------------------------------------------------------------------
                         Length of ko18fb1.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	ko18fb2.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/ko18fb2_sorted.bam  
0                                               1237      
1                                                 52      
2                                                  6      
3                                                  5      
4                                                444      
--------------------------------------------------------------------------------
                         Length of ko18fb2.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	ko18hb1.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/ko18hb1_sorted.bam  
0                                                773      
1                                                 66      
2                                                  4      
3                                                  8      
4                                                535      
--------------------------------------------------------------------------------
                         Length of ko18hb1.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	ko18hb2.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/ko18hb2_sorted.bam  
0                                               1151      
1                                                 38      
2                                                  8      
3                                                  5      
4                                                420      
--------------------------------------------------------------------------------
                         Length of ko18hb2.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	ko18mb1.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/ko18mb1_sorted.bam  
0                                                959      
1                                                 43      
2                                                 14      
3                                                 10      
4                                                392      
--------------------------------------------------------------------------------
                         Length of ko18mb1.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	ko18mb2.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/ko18mb2_sorted.bam  
0                                               1717      
1                                                 32      
2                                                  8      
3                                                  0      
4                                                399      
--------------------------------------------------------------------------------
                         Length of ko18mb2.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	ko18sc1.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/ko18sc1_sorted.bam  
0                                                771      
1                                                 58      
2                                                  7      
3                                                  8      
4                                                501      
--------------------------------------------------------------------------------
                         Length of ko18sc1.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	ko18sc2.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/ko18sc2_sorted.bam  
0                                               1556      
1                                                 53      
2                                                  3      
3                                                  7      
4                                                384      
--------------------------------------------------------------------------------
                         Length of ko18sc2.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	wt11fb1.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/wt11fb1_sorted.bam  
0                                                104      
1                                                 11      
2                                                  0      
3                                                  1      
4                                                349      
--------------------------------------------------------------------------------
                         Length of wt11fb1.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	wt11fb2.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/wt11fb2_sorted.bam  
0                                                 56      
1                                                  2      
2                                                  0      
3                                                  4      
4                                                353      
--------------------------------------------------------------------------------
                         Length of wt11fb2.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	wt11hb1.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/wt11hb1_sorted.bam  
0                                                205      
1                                                  7      
2                                                  0      
3                                                 20      
4                                                345      
--------------------------------------------------------------------------------
                         Length of wt11hb1.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	wt11hb2.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/wt11hb2_sorted.bam  
0                                                139      
1                                                 13      
2                                                  1      
3                                                 15      
4                                                377      
--------------------------------------------------------------------------------
                         Length of wt11hb2.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	wt11mb1.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/wt11mb1_sorted.bam  
0                                                164      
1                                                  1      
2                                                  0      
3                                                 12      
4                                                298      
--------------------------------------------------------------------------------
                         Length of wt11mb1.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	wt11mb2.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/wt11mb2_sorted.bam  
0                                                118      
1                                                  7      
2                                                  0      
3                                                  6      
4                                                399      
--------------------------------------------------------------------------------
                         Length of wt11mb2.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	wt11sc1.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/wt11sc1_sorted.bam  
0                                                152      
1                                                  2      
2                                                  0      
3                                                  9      
4                                                364      
--------------------------------------------------------------------------------
                         Length of wt11sc1.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	wt11sc2.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/wt11sc2_sorted.bam  
0                                                103      
1                                                  6      
2                                                  0      
3                                                 15      
4                                                379      
--------------------------------------------------------------------------------
                         Length of wt11sc2.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	wt13fb1.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/wt13fb1_sorted.bam  
0                                                354      
1                                                  1      
2                                                  0      
3                                                  0      
4                                                296      
--------------------------------------------------------------------------------
                         Length of wt13fb1.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	wt13fb2.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/wt13fb2_sorted.bam  
0                                                341      
1                                                  4      
2                                                  0      
3                                                  0      
4                                                287      
--------------------------------------------------------------------------------
                         Length of wt13fb2.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	wt13hb1.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/wt13hb1_sorted.bam  
0                                                896      
1                                                  9      
2                                                  1      
3                                                 13      
4                                                203      
--------------------------------------------------------------------------------
                         Length of wt13hb1.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	wt13hb2.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/wt13hb2_sorted.bam  
0                                               1059      
1                                                  4      
2                                                  2      
3                                                 14      
4                                                237      
--------------------------------------------------------------------------------
                         Length of wt13hb2.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	wt13mb1.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/wt13mb1_sorted.bam  
0                                                584      
1                                                 11      
2                                                  0      
3                                                  5      
4                                                313      
--------------------------------------------------------------------------------
                         Length of wt13mb1.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	wt13mb2.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/wt13mb2_sorted.bam  
0                                                522      
1                                                  6      
2                                                  0      
3                                                  2      
4                                                241      
--------------------------------------------------------------------------------
                         Length of wt13mb2.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	wt13sc1.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/wt13sc1_sorted.bam  
0                                                771      
1                                                  4      
2                                                  0      
3                                                 12      
4                                                359      
--------------------------------------------------------------------------------
                         Length of wt13sc1.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	wt13sc2.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/wt13sc2_sorted.bam  
0                                                803      
1                                                  9      
2                                                  1      
3                                                  8      
4                                                369      
--------------------------------------------------------------------------------
                         Length of wt13sc2.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	wt15fb1.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/wt15fb1_sorted.bam  
0                                               1076      
1                                                 46      
2                                                  4      
3                                                 16      
4                                                361      
--------------------------------------------------------------------------------
                         Length of wt15fb1.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	wt15fb2.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/wt15fb2_sorted.bam  
0                                                140      
1                                                110      
2                                                  2      
3                                                 10      
4                                                614      
--------------------------------------------------------------------------------
                         Length of wt15fb2.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	wt15hb1.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/wt15hb1_sorted.bam  
0                                               1823      
1                                                 74      
2                                                 12      
3                                                  7      
4                                                302      
--------------------------------------------------------------------------------
                         Length of wt15hb1.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	wt15hb2.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/wt15hb2_sorted.bam  
0                                                790      
1                                                387      
2                                                 10      
3                                                  5      
4                                                601      
--------------------------------------------------------------------------------
                         Length of wt15hb2.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	wt15mb1.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/wt15mb1_sorted.bam  
0                                               1399      
1                                                 90      
2                                                  3      
3                                                  7      
4                                                329      
--------------------------------------------------------------------------------
                         Length of wt15mb1.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	wt15mb2.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/wt15mb2_sorted.bam  
0                                                744      
1                                                153      
2                                                  4      
3                                                  2      
4                                                328      
--------------------------------------------------------------------------------
                         Length of wt15mb2.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	wt15sc1.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/wt15sc1_sorted.bam  
0                                                863      
1                                                 64      
2                                                  7      
3                                                  8      
4                                                376      
--------------------------------------------------------------------------------
                         Length of wt15sc1.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	wt15sc2.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/wt15sc2_sorted.bam  
0                                               1196      
1                                                136      
2                                                 12      
3                                                 19      
4                                                650      
--------------------------------------------------------------------------------
                         Length of wt15sc2.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	wt18fb1.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/wt18fb1_sorted.bam  
0                                               1274      
1                                                 33      
2                                                  3      
3                                                  9      
4                                                330      
--------------------------------------------------------------------------------
                         Length of wt18fb1.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	wt18fb2.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/wt18fb2_sorted.bam  
0                                               1260      
1                                                 48      
2                                                  1      
3                                                  8      
4                                                353      
--------------------------------------------------------------------------------
                         Length of wt18fb2.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	wt18hb1.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/wt18hb1_sorted.bam  
0                                               1419      
1                                                 45      
2                                                 13      
3                                                  3      
4                                                404      
--------------------------------------------------------------------------------
                         Length of wt18hb1.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	wt18hb2.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/wt18hb2_sorted.bam  
0                                               1869      
1                                                 26      
2                                                  4      
3                                                  3      
4                                                393      
--------------------------------------------------------------------------------
                         Length of wt18hb2.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	wt18mb1.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/wt18mb1_sorted.bam  
0                                               1788      
1                                                 21      
2                                                  9      
3                                                  6      
4                                                369      
--------------------------------------------------------------------------------
                         Length of wt18mb1.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	wt18mb2.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/wt18mb2_sorted.bam  
0                                               1791      
1                                                 26      
2                                                  7      
3                                                  6      
4                                                250      
--------------------------------------------------------------------------------
                         Length of wt18mb2.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	wt18sc1.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/wt18sc1_sorted.bam  
0                                               1299      
1                                                 43      
2                                                 12      
3                                                  5      
4                                                365      
--------------------------------------------------------------------------------
                         Length of wt18sc1.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Adding	wt18sc2.txt	                               
--------------------------------------------------------------------------------
      Geneid                            Chr  \
0     497097                 chr1;chr1;chr1   
1  100503874                      chr1;chr1   
2  100038431                           chr1   
3      19888  chr1;chr1;chr1;chr1;chr1;chr1   
4      20671       chr1;chr1;chr1;chr1;chr1   

                                             Start  \
0                          3214482;3421702;3670552   
1                                  3647309;3658847   
2                                          3680155   
3  4290846;4343507;4351910;4352202;4360200;4409170   
4          4490928;4493100;4493772;4495136;4496291   

                                               End       Strand  Length  \
0                          3216968;3421901;3671498        -;-;-    3634   
1                                  3650509;3658904          -;-    3259   
2                                          3681788            +    1634   
3  4293012;4350091;4352081;4352837;4360314;4409241  -;-;-;-;-;-    9747   
4          4492668;4493466;4493863;4495942;4496413    -;-;-;-;-    3130   

   output/thor/HISAT2_MAPPED_04052020/wt18sc2_sorted.bam  
0                                               1456      
1                                                 55      
2                                                  4      
3                                                  5      
4                                                357      
--------------------------------------------------------------------------------
                         Length of wt18sc2.txt: 	27179	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                   Length of merged RNAseq dataframe: 	27179	                   
--------------------------------------------------------------------------------

3) Add annotation information

We add annotation information to the genes: name, etc. This will also allow us to merge with our ChIP data which have been annotated with ensembl IDs.

We also want to save the merged dataframe to CSV files (grouped by the tissue) as then we will run DeSEQ2 on these count files.

In [3]:
"""
--------------------------------------------------------
Add annotation to RNAseq dataframe & save Tissue counts as files.
--------------------------------------------------------
"""
# Lastly let's add some gene annotation information

"""
Annotations were generated using scibiomart v1.0.0 CLI

scibiomart --m ENSEMBL_MART_ENSEMBL --d mmusculus_gene_ensembl --a "ensembl_gene_id,entrezgene_id,external_gene_name,chromosome_name,start_position,end_position,strand" --o mm10Sorted_ --s t

"""

# Remove data that were not able to be annotated to gene names (i.e. these are usually the genes which are "unknown")

annot_df = pd.read_csv(os.path.join(input_dir, 'supps', 'mm10Sorted_mmusculus_gene_ensembl-GRCm38.p6.csv'))
gene_names = annot_df[gene_name].values
gene_id_to_name = {}
for i, g in enumerate(annot_df[gene_id].values):
    if not gene_id_to_name.get(g):
        gene_id_to_name[g] = gene_names[i]

# add these to the df column
gene_names = []
pseudo_id = []
df = df.rename(columns={'Geneid': 'entrezgene_id'})

for g in df[gene_id].values:
    gene_n = gene_id_to_name.get(g)
    gene_names.append(gene_n)
    pseudo_id.append(f'{g}-{gene_n}')
df[gene_name] = gene_names
df['u_id'] = pseudo_id

# Organise columns a bit nicer
cols = [gene_id, gene_name, 'u_id',
        'ko11fb1', 'ko11fb2', 'ko13fb1', 'ko13fb2', 'ko15fb1', 'ko15fb2', 'ko18fb1', 'ko18fb2',
        'ko11mb1', 'ko11mb2', 'ko13mb1', 'ko13mb2', 'ko15mb1', 'ko15mb2', 'ko18mb1', 'ko18mb2',
        'ko11hb1', 'ko11hb2', 'ko13hb1', 'ko13hb2', 'ko15hb1', 'ko15hb2', 'ko18hb1', 'ko18hb2',
        'ko11sc1', 'ko11sc2', 'ko13sc1', 'ko13sc2', 'ko15sc1', 'ko15sc2', 'ko18sc1', 'ko18sc2',
         'wt11fb1', 'wt11fb2', 'wt13fb1', 'wt13fb2', 'wt15fb1', 'wt15fb2', 'wt18fb1', 'wt18fb2',
        'wt11mb1', 'wt11mb2', 'wt13mb1', 'wt13mb2', 'wt15mb1', 'wt15mb2', 'wt18mb1', 'wt18mb2',
        'wt11hb1', 'wt11hb2', 'wt13hb1', 'wt13hb2', 'wt15hb1', 'wt15hb2', 'wt18hb1', 'wt18hb2',
        'wt11sc1', 'wt11sc2', 'wt13sc1', 'wt13sc2', 'wt15sc1', 'wt15sc2', 'wt18sc1', 'wt18sc2'
]

df = df[cols]
u.dp(["Number of genes in total:", len(df)])
# Remove the values that don't map to a gene name. On investigation these are 
# most commonly predicted genes, or non-coding RNA (which we don't expect to have messenger RNA for)
# for example: 2610203C22Rik RIKEN cDNA 2610203C22 gene [ Mus musculus (house mouse) ]
# There were about 6000 genes. None of which had a logFC > 2 with the majority having 0 LogFC
df = df[df[gene_name].values != None]
u.dp(["Number of genes with gene names:", len(df)])
df.to_csv(f'{output_dir}merged_df_FEATURE_COUNTS_annot_{date}.csv', index=False)
--------------------------------------------------------------------------------
                        Number of genes in total:	27179	                        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                    Number of genes with gene names:	20900	                     
--------------------------------------------------------------------------------

4) Save dataframe combinations to csv files

Here we just save each combination that we want to run DE on to a CSV file.

In [5]:
"""
Save dataframe combinations to CSV files to do DEseq2 analysis
"""

# Create DF for each tissue
tissues = ['fb', 'mb', 'hb', 'sc']
for t in tissues:
    ts = [u_id]
    for c in df.columns:
        if t in c and '11' not in c and 'id' not in c:
            ts.append(c)
    t_df = df[ts]
    t_df.to_csv(f'{r_dir}merged_df_{t}_FEATURE_COUNTS_{date}.csv', index=False)
    
# Create DF for each timepoint
times = ['11', '13', '15', '18']
for t in times:
    ts = [u_id]
    for c in df.columns:
        if t in c and 'id' not in c and ('fb' in c or 'mb' in c):
            ts.append(c)
    t_df = df[ts]
    t_df.to_csv(f'{r_dir}merged_df_anterior_{t}_FEATURE_COUNTS_{date}.csv', index=False)

for t in times:
    ts = [u_id]
    for c in df.columns:
        if t in c and 'id' not in c and ('sc' in c or 'hb' in c):
            ts.append(c)
    t_df = df[ts]
    t_df.to_csv(f'{r_dir}merged_df_posterior_{t}_FEATURE_COUNTS_{date}.csv', index=False)

    
# Create df for all the data (including e11)
columns = [gene_id]
for c in df.columns:
    if 'wt' in c or 'ko' in c:
        columns.append(c)
df[columns].to_csv(f'{r_dir}merged_df_FEATURE_COUNTS_{date}.csv', index=False)

# Create combo between each tissue (i.e sc vs mb)
def gen_csv_combo(cond1, cond2):
    condition = ['wt', 'ko']
    for t in condition:
        ts = [u_id]
        for c in df.columns:
            if t in c and (cond1 in c or cond2 in c) and gene_id != c and gene_name != c and '11' not in c:
                ts.append(c)
        t_df = df[ts]
        t_df.to_csv(f'{r_dir}merged_df_{t}_{cond1}-{cond2}_FEATURE_COUNTS_{date}.csv', index=False)
        u.dp(["Cond done:", f'{t}_{cond1}-{cond2}'])

def gen_csv_combo_time(cond1, cond2):
    condition = ['wt', 'ko']
    for t in condition:
        ts = [u_id]
        for c in df.columns:
            if t in c and (cond1 in c or cond2 in c) and gene_id != c and gene_name != c and ('sc' in c or 'hb' in c):
                ts.append(c)
        t_df = df[ts]
        t_df.to_csv(f'{r_dir}merged_df_posterior_{t}_{cond1}-{cond2}_FEATURE_COUNTS_{date}.csv', index=False)
        u.dp(["Cond done:", f'posterior_{t}_{cond1}-{cond2}'])

    for t in condition:
        ts = [u_id]
        for c in df.columns:
            if t in c and (cond1 in c or cond2 in c) and gene_id != c and gene_name != c and ('fb' in c or 'mb' in c):
                ts.append(c)
        t_df = df[ts]
        t_df.to_csv(f'{r_dir}merged_df_anterior_{t}_{cond1}-{cond2}_FEATURE_COUNTS_{date}.csv', index=False)
        u.dp(["Cond done:", f'anterior_{t}_{cond1}-{cond2}'])
        
gen_csv_combo('fb', 'mb')
gen_csv_combo('fb', 'hb')
gen_csv_combo('fb', 'sc')
gen_csv_combo('mb', 'sc')
gen_csv_combo('mb', 'hb')
gen_csv_combo('hb', 'sc')
gen_csv_combo_time('11', '18')
gen_csv_combo_time('11', '13')
gen_csv_combo_time('13', '18')
gen_csv_combo_time('15', '18')
gen_csv_combo_time('11', '15')
gen_csv_combo_time('13', '15')

df.to_csv(f'{r_dir}merged_df_FEATURE_COUNTS_annot_{date}.csv', index=False)

u.dp(["Combinations generated!"])
--------------------------------------------------------------------------------
                              Cond done:	wt_fb-mb	                              
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Cond done:	ko_fb-mb	                              
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Cond done:	wt_fb-hb	                              
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Cond done:	ko_fb-hb	                              
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Cond done:	wt_fb-sc	                              
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Cond done:	ko_fb-sc	                              
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Cond done:	wt_mb-sc	                              
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Cond done:	ko_mb-sc	                              
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Cond done:	wt_mb-hb	                              
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Cond done:	ko_mb-hb	                              
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Cond done:	wt_hb-sc	                              
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Cond done:	ko_hb-sc	                              
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                         Cond done:	posterior_wt_11-18	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                         Cond done:	posterior_ko_11-18	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                         Cond done:	anterior_wt_11-18	                          
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                         Cond done:	anterior_ko_11-18	                          
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                         Cond done:	posterior_wt_11-13	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                         Cond done:	posterior_ko_11-13	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                         Cond done:	anterior_wt_11-13	                          
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                         Cond done:	anterior_ko_11-13	                          
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                         Cond done:	posterior_wt_13-18	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                         Cond done:	posterior_ko_13-18	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                         Cond done:	anterior_wt_13-18	                          
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                         Cond done:	anterior_ko_13-18	                          
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                         Cond done:	posterior_wt_15-18	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                         Cond done:	posterior_ko_15-18	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                         Cond done:	anterior_wt_15-18	                          
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                         Cond done:	anterior_ko_15-18	                          
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                         Cond done:	posterior_wt_11-15	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                         Cond done:	posterior_ko_11-15	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                         Cond done:	anterior_wt_11-15	                          
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                         Cond done:	anterior_ko_11-15	                          
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                         Cond done:	posterior_wt_13-15	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                         Cond done:	posterior_ko_13-15	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                         Cond done:	anterior_wt_13-15	                          
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                         Cond done:	anterior_ko_13-15	                          
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                            Combinations generated!	                            
--------------------------------------------------------------------------------

5) Run R scripts a. DeSEQ2 & b. Normalisation

Here we need to step out of this notebook and run the following R markdowns:

2) Run Deseq2 (apAxis_between-cond-tissue.Rmd, apAxis_between-time.Rmd, apAxis_between-tissue.Rmd)
3) Normalise the data (apAxis_between-cond-tissue.Rmd)

Prior to running our analysis, we need to normalise the RNAseq data. Given we're interested in integrating the changes across the genes, we use EdgeRs TMM method.

Script below for normalisation

Run script in file: apAxis_between-cond-tissue.Rmd
In [6]:
"""
--------------------------------------------------------
Display the normalisations 
--------------------------------------------------------
"""
%matplotlib inline
from sciviso import Histogram, Scatterplot

plot_on = True
plt_on = True
save_on = True
# Show an example of each method's normalisation
def show_scatter(df, cond_x, cond_y, x_label='', y_label='', title='', log2=False, offset=1, annotations=None): 
    a = 0.1
    offset = 1
    plt_df = pd.DataFrame()
    x = cond_x
    y = cond_y
    if log2:
        x = f'Log2({cond_x} + {offset})'
        x = f'Log2({cond_y} + {offset})'
        plt_df[x] = np.log2(df[cond_x].values + offset)
        plt_df[y] = np.log2(df[cond_y].values + offset)
    else:
        plt_df[x] = np.log2(df[cond_x].values + offset)
        plt_df[y] = np.log2(df[cond_y].values + offset)
    
    # plot histogram 
    h = Histogram(plt_df, x, title=title, fit_norm=False, plot_rug=True, xlabel=x)
    h.plot()
    save_fig(f'Hist_{x}_{title}', ending='.pdf')
    plt.show()
    # now run the scatter
    s = Scatterplot(plt_df, x, y, title=title, xlabel=x, ylabel=y, colour="grey", add_legend=False)
    s.opacity=0.5
    s.plot()
    save_fig(f'Scatter_{x}-{y}_{title}', ending='.pdf')
    plt.show()

    
deseq2 = pd.read_csv(f'{r_dir}merged_df_FEATURE_COUNTS_DEseq2Norm_{date}.csv')
rlog = pd.read_csv(f'{r_dir}merged_df_FEATURE_COUNTS_rlog_{date}.csv')
tmm = pd.read_csv(f'{r_dir}merged_df_FEATURE_COUNTS_tmm_{date}.csv')
vst = pd.read_csv(f'{r_dir}merged_df_FEATURE_COUNTS_vst_{date}.csv')

sc_genes = None
show_scatter(deseq2, 'wt11fb1', 'ko11fb1', 'wt11fb1', 'ko11fb1', 'Forebrain 11 WT vs KO DEseq2', True, 1.0, sc_genes)
show_scatter(deseq2, 'wt11fb1', 'wt11fb2', 'wt11fb1', 'wt11fb2', 'Forebrain WT vs WT DEseq2', True, 1.0, sc_genes)
show_scatter(rlog, 'wt11fb1', 'ko11fb1', 'wt11fb1', 'ko11fb1', 'Forebrain 11 WT vs KO Rlog', False, 1.0, sc_genes)
show_scatter(rlog, 'wt11fb1', 'wt11fb2', 'wt11fb1', 'wt11fb2', 'Forebrain WT vs WT Rlog', False, 1.0, sc_genes)
show_scatter(tmm, 'wt11fb1', 'ko11fb1', 'wt11fb1', 'ko11fb1', 'Forebrain 11 WT vs KO TMM', True, 1.0, sc_genes)
show_scatter(tmm, 'wt11fb1', 'wt11fb2', 'wt11fb1', 'wt11fb2', 'Forebrain WT vs WT TMM', True, 1.0, sc_genes)
show_scatter(vst, 'wt11fb1', 'ko11fb1', 'wt11fb1', 'ko11fb1', 'Forebrain 11 WT vs KO VST', False, 1.0, sc_genes)
show_scatter(vst, 'wt11fb1', 'wt11fb2', 'wt11fb1', 'wt11fb2', 'Forebrain WT vs WT VST', False, 1.0, sc_genes)
/Users/ariane/opt/miniconda3/envs/ml/lib/python3.8/site-packages/seaborn/distributions.py:2557: FutureWarning: `distplot` is a deprecated function and will be removed in a future version. Please adapt your code to use either `displot` (a figure-level function with similar flexibility) or `histplot` (an axes-level function for histograms).
  warnings.warn(msg, FutureWarning)
/Users/ariane/opt/miniconda3/envs/ml/lib/python3.8/site-packages/seaborn/distributions.py:2056: FutureWarning: The `axis` variable is no longer used and will be removed. Instead, assign variables directly to `x` or `y`.
  warnings.warn(msg, FutureWarning)
/Users/ariane/opt/miniconda3/envs/ml/lib/python3.8/site-packages/seaborn/distributions.py:2557: FutureWarning: `distplot` is a deprecated function and will be removed in a future version. Please adapt your code to use either `displot` (a figure-level function with similar flexibility) or `histplot` (an axes-level function for histograms).
  warnings.warn(msg, FutureWarning)
/Users/ariane/opt/miniconda3/envs/ml/lib/python3.8/site-packages/seaborn/distributions.py:2056: FutureWarning: The `axis` variable is no longer used and will be removed. Instead, assign variables directly to `x` or `y`.
  warnings.warn(msg, FutureWarning)
/Users/ariane/opt/miniconda3/envs/ml/lib/python3.8/site-packages/seaborn/distributions.py:2557: FutureWarning: `distplot` is a deprecated function and will be removed in a future version. Please adapt your code to use either `displot` (a figure-level function with similar flexibility) or `histplot` (an axes-level function for histograms).
  warnings.warn(msg, FutureWarning)
/Users/ariane/opt/miniconda3/envs/ml/lib/python3.8/site-packages/seaborn/distributions.py:2056: FutureWarning: The `axis` variable is no longer used and will be removed. Instead, assign variables directly to `x` or `y`.
  warnings.warn(msg, FutureWarning)
/Users/ariane/opt/miniconda3/envs/ml/lib/python3.8/site-packages/seaborn/distributions.py:2557: FutureWarning: `distplot` is a deprecated function and will be removed in a future version. Please adapt your code to use either `displot` (a figure-level function with similar flexibility) or `histplot` (an axes-level function for histograms).
  warnings.warn(msg, FutureWarning)
/Users/ariane/opt/miniconda3/envs/ml/lib/python3.8/site-packages/seaborn/distributions.py:2056: FutureWarning: The `axis` variable is no longer used and will be removed. Instead, assign variables directly to `x` or `y`.
  warnings.warn(msg, FutureWarning)
/Users/ariane/opt/miniconda3/envs/ml/lib/python3.8/site-packages/seaborn/distributions.py:2557: FutureWarning: `distplot` is a deprecated function and will be removed in a future version. Please adapt your code to use either `displot` (a figure-level function with similar flexibility) or `histplot` (an axes-level function for histograms).
  warnings.warn(msg, FutureWarning)
/Users/ariane/opt/miniconda3/envs/ml/lib/python3.8/site-packages/seaborn/distributions.py:2056: FutureWarning: The `axis` variable is no longer used and will be removed. Instead, assign variables directly to `x` or `y`.
  warnings.warn(msg, FutureWarning)
/Users/ariane/opt/miniconda3/envs/ml/lib/python3.8/site-packages/seaborn/distributions.py:2557: FutureWarning: `distplot` is a deprecated function and will be removed in a future version. Please adapt your code to use either `displot` (a figure-level function with similar flexibility) or `histplot` (an axes-level function for histograms).
  warnings.warn(msg, FutureWarning)
/Users/ariane/opt/miniconda3/envs/ml/lib/python3.8/site-packages/seaborn/distributions.py:2056: FutureWarning: The `axis` variable is no longer used and will be removed. Instead, assign variables directly to `x` or `y`.
  warnings.warn(msg, FutureWarning)
/Users/ariane/opt/miniconda3/envs/ml/lib/python3.8/site-packages/seaborn/distributions.py:2557: FutureWarning: `distplot` is a deprecated function and will be removed in a future version. Please adapt your code to use either `displot` (a figure-level function with similar flexibility) or `histplot` (an axes-level function for histograms).
  warnings.warn(msg, FutureWarning)
/Users/ariane/opt/miniconda3/envs/ml/lib/python3.8/site-packages/seaborn/distributions.py:2056: FutureWarning: The `axis` variable is no longer used and will be removed. Instead, assign variables directly to `x` or `y`.
  warnings.warn(msg, FutureWarning)
/Users/ariane/opt/miniconda3/envs/ml/lib/python3.8/site-packages/seaborn/distributions.py:2557: FutureWarning: `distplot` is a deprecated function and will be removed in a future version. Please adapt your code to use either `displot` (a figure-level function with similar flexibility) or `histplot` (an axes-level function for histograms).
  warnings.warn(msg, FutureWarning)
/Users/ariane/opt/miniconda3/envs/ml/lib/python3.8/site-packages/seaborn/distributions.py:2056: FutureWarning: The `axis` variable is no longer used and will be removed. Instead, assign variables directly to `x` or `y`.
  warnings.warn(msg, FutureWarning)

6) Plot results from DEseq2

Here we want to do the normal figures for a DE analysis i.e. volcano plots, heatmaps, & Venn diagrams etc.

In [7]:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
from adjustText import adjust_text

from sciviso import Vis


class Volcanoplot(Vis):

    def __init__(self, df: pd.DataFrame, log_fc: str, p_val: str, label_column: str, title='',
                 xlabel='', ylabel='', invert=False, p_val_cutoff=0.05,
                 log_fc_cuttoff=2, label_big_sig=False, colours=None, offset=None,
                 text_colours=None, values_to_label=None, max_labels=20, values_colours=None,
                 figsize=(2, 2), title_font_size=8, label_font_size=6, title_font_weight=700):
        super().__init__(df, figsize=figsize, title_font_size=title_font_size, label_font_size=label_font_size,
                         title_font_weight=title_font_weight)
        super().__init__(df)
        self.log_fc = log_fc
        self.p_val = p_val
        self.p_val_cutoff = p_val_cutoff
        self.log_fc_cuttoff = log_fc_cuttoff
        self.values_to_label = values_to_label
        self.label_big_sig = label_big_sig
        self.invert = invert
        self.label_column = label_column
        self.offset = offset
        self.label = 'volcanoplot'
        self.colours = {'ns_small-neg-logFC': 'lightgrey',
                        'ns_small-pos-logFC': 'lightgrey',
                        'ns_big-neg-logFC': 'grey',
                        'ns_big-pos-logFC': 'grey',
                        'sig_small-neg-logFC': '#2970b1',
                        'sig_small-pos-logFC': '#d6604c',
                        'sig_big-neg-logFC': '#0a3568',
                        'sig_big-pos-logFC': '#6f0220'} if colours is None else colours
        self.xlabel = xlabel
        self.ylabel = ylabel
        self.title = title
        self.max_labels = max_labels
        self.values_colours = values_colours or {}
        self.text_colours = text_colours or {}

    def add_scatter_and_annotate(self, fig: plt, x_all: np.array, y_all: np.array,
                                 colour: str, idxs: np.array, annotate=False):
        x = x_all[idxs]
        y = y_all[idxs]
        ax = fig.scatter(x, y, c=colour, alpha=self.opacity, s=10, vmin=-10.0, vmax=10.0)

        # Check if we want to annotate any of these with their gene IDs

        if self.values_to_label is not None:
            texts = []
            labels = self.df[self.label_column].values[idxs]
            for i, name in enumerate(labels):
                if name in self.values_to_label:
                    lbl_bg = self.values_colours.get(name)
                    color = self.text_colours.get(name)
                    texts.append(fig.text(x[i], y[i], name, color=color, fontsize=6,
                                          bbox=dict(fc=lbl_bg, alpha=1.0, boxstyle='round,pad=0.1', lw=0)))
            adjust_text(texts, force_text=2.0)
        # Check if the user wants these labeled
        if self.label_big_sig and annotate:
            # If they do have a limit on the number of ones we show (i.e. we don't want 10000 gene names...)
            max_values = -1 * self.max_labels
            if len(y) < self.max_labels:
                max_values = -1 * (len(y) - 1)
            most_sig_idxs = np.argpartition(y, max_values)[max_values:]
            labels = self.df[self.label_column].values[idxs][most_sig_idxs]
            x = x[most_sig_idxs]
            y = y[most_sig_idxs]
            # We only label the ones with the max log fc
            for i, name in enumerate(labels):
                fig.annotate(name, (x[i], y[i]),
                             xytext=(0, 10),
                             textcoords='offset points', ha='center', va='bottom',
                             bbox=dict(boxstyle='round,pad=0',
                                       fc=None, alpha=0.2)
                             )
        return ax

    def plot(self):
        """
        For annotation styling see: https://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.annotate
        Returns
        -------

        """
        # if offset is not given, make the offset the smallest value in the dataset
        if not self.offset:
            vals = self.df[self.p_val].values
            self.offset = np.min(vals[np.nonzero(vals)])
            self.u.warn_p(['No offset was provided, setting offset to be smallest value recorded in dataset: ',
                           self.offset])

        # x axis has log_fc, first only plot the values < cutoff
        x = self.df[self.log_fc].values
        y = -1 * np.log10(self.df[self.p_val].values + self.offset)

        log_fc_np = self.df[self.log_fc].values
        p_val_np = self.df[self.p_val].values

        if self.invert:
            x = -1 * np.log10(self.df[self.p_val].values + self.offset)
            y = self.df[self.log_fc].values
        sig_small_pos_logfc = np.where((p_val_np <= self.p_val_cutoff) & (np.abs(log_fc_np) < self.log_fc_cuttoff)
                                       & (log_fc_np > 0))
        sig_big_pos_logfc = np.where((p_val_np <= self.p_val_cutoff) & (np.abs(log_fc_np) >= self.log_fc_cuttoff)
                                     & (log_fc_np > 0))

        sig_small_neg_logfc = np.where((p_val_np <= self.p_val_cutoff) & (np.abs(log_fc_np) < self.log_fc_cuttoff)
                                       & (log_fc_np <= 0))
        sig_big_neg_logfc = np.where((p_val_np <= self.p_val_cutoff) & (np.abs(log_fc_np) >= self.log_fc_cuttoff)
                                     & (log_fc_np <= 0))

        # Plot the points
        fig, ax = plt.subplots(figsize=(2.5, 2.5))
        self.add_scatter_and_annotate(ax, x, y, self.colours['sig_small-pos-logFC'], sig_small_pos_logfc)
        self.add_scatter_and_annotate(ax, x, y, self.colours['sig_big-pos-logFC'], sig_big_pos_logfc, annotate=True)

        # Negative
        self.add_scatter_and_annotate(ax, x, y, self.colours['sig_small-neg-logFC'], sig_small_neg_logfc)
        self.add_scatter_and_annotate(ax, x, y, self.colours['sig_big-neg-logFC'], sig_big_neg_logfc, annotate=True)
        self.add_labels()
        ax.tick_params(labelsize=6)
        ax.tick_params(direction='out', length=2, width=0.5)
        ax.spines['bottom'].set_linewidth(0.5)
        ax.spines['top'].set_linewidth(0)
        ax.spines['left'].set_linewidth(0.5)
        ax.spines['right'].set_linewidth(0)
        ax.tick_params(labelsize=6)
        ax.tick_params(axis='x', which='major', pad=0)
        ax.tick_params(axis='y', which='major', pad=0)
        return ax
In [11]:
"""
---------------------------------------------------------------
            Read in results from DEseq2 and format DFs nicer
---------------------------------------------------------------
"""

df_dict = {}
deseq2_files = os.listdir(r_dir)
for f in deseq2_files:
    if 'DEseq2' in f:
        try:
            de_df = pd.read_csv(os.path.join(r_dir, f))
            de_df = de_df.rename(columns={de_df.columns[0]: 'u_id'})
            gene_names = [s.split('-')[1] for s in list(de_df['u_id'].values)]
            gene_ids = [s.split('-')[0] for s in list(de_df['u_id'].values)]
            de_df['padj'] = de_df['padj'].fillna(1) # Replace Nan p values with 1.0s
            # Replace Nans with 0's for other values
            de_df = de_df.replace(np.nan, 0)
            de_df[gene_id] = gene_ids
            de_df[gene_name] = gene_names
            df_dict[f] = de_df
        except:
            print(f)
            
"""
---------------------------------------------------------------
            Set up markers to display on Volcano plot
---------------------------------------------------------------
"""
fb_genes = ['Emx1', 'Eomes', 'Tbr1', 'Foxg1', 'Lhx6']
mb_genes = ['En1', 'En2', 'Lmx1a', 'Bhlhe23', 'Sall4']
hb_genes = ['Phox2b', 'Krox20', 'Fev', 'Hoxb1',  'Hoxd3']
sc_genes = ['Hoxd8', 'Hoxd9', 'Hoxd10', 'Hoxd11','Hoxa7', 'Hoxa9', 'Hoxa10',
            'Hoxb9', 'Hoxb13',  'Hoxc8', 'Hoxc9', 'Hoxc10', 'Hoxc11', 'Hoxc12', 'Hoxc13']
progenitors = ['Sox2', 'Sox1', 'Sox3', 'Hes1', 'Hes5']
neurons = ['Tubb3', 'Snap25', 'Syt1', 'Slc32a1','Slc17a6']
glia = ['Pdgfra', 'Cspg4', 'Aqp4', 'Egfr', 'Slc6a11']

vals_to_label = fb_genes + mb_genes + hb_genes + sc_genes + progenitors + neurons + glia
lbl_colours = {}
text_colours = {}

for g in fb_genes:
    lbl_colours[g] = fb_colour
    text_colours[g] = 'black'
for g in mb_genes:
    lbl_colours[g] = mb_colour
    text_colours[g] = 'black'
for g in hb_genes:
    lbl_colours[g] =  hb_colour
    text_colours[g] = 'white'
for g in sc_genes:
    lbl_colours[g] = sc_colour
    text_colours[g] = 'white'
for g in progenitors:
    lbl_colours[g] = e11_colour
    text_colours[g] = 'black'
for g in neurons:
    lbl_colours[g] = e13_colour
    text_colours[g] = 'black'
for g in glia:
    lbl_colours[g] = e15_colour 
    text_colours[g] = 'black'
    
def plot_volcano(df, title, save=True, show=True):
    volcanoplot = Volcanoplot(df, 'log2FoldChange', 'padj', gene_name, 
                              title, 'Log 2 Fold change', '-log10(p adj)', 
                              p_val_cutoff=0.05,
                              label_big_sig=False, log_fc_cuttoff=1.5, 
                              values_to_label=vals_to_label, figsize=(2,2),
                              values_colours=lbl_colours, text_colours=text_colours)
    sns.set_style("ticks")
    volcanoplot.plot()
    if save:
        save_fig(f'Volcano{title.replace(" ", "")}', ending='.pdf')
    if show:
        plt.show()

"""
---------------------------------------------------------------
            Plot volcanos
---------------------------------------------------------------
"""
plot_volcano(df_dict[f'DEseq2_CNS_fb_{date}.csv'], "FB EedcKO vs WT")
plot_volcano(df_dict[f'DEseq2_CNS_mb_{date}.csv'], "MB EedcKO vs WT")
plot_volcano(df_dict[f'DEseq2_CNS_hb_{date}.csv'], "HB EedcKO vs WT")
plot_volcano(df_dict[f'DEseq2_CNS_sc_{date}.csv'], "SC EedcKO vs WT")
plot_volcano(df_dict[f'DEseq2_CNS_anterior_11_{date}.csv'], "Anterior e11.5 EedcKO vs WT")
plot_volcano(df_dict[f'DEseq2_CNS_anterior_13_{date}.csv'], "Anterior e13.5 EedcKO vs WT")
plot_volcano(df_dict[f'DEseq2_CNS_anterior_15_{date}.csv'], "Anterior e15.5 EedcKO vs WT")
plot_volcano(df_dict[f'DEseq2_CNS_anterior_18_{date}.csv'], "Anterior e18.5 EedcKO vs WT")
plot_volcano(df_dict[f'DEseq2_CNS_posterior_11_{date}.csv'], "Posterior e11.5 EedcKO vs WT")
plot_volcano(df_dict[f'DEseq2_CNS_posterior_13_{date}.csv'], "Posterior e13.5 EedcKO vs WT")
plot_volcano(df_dict[f'DEseq2_CNS_posterior_15_{date}.csv'], "Posterior e15.5 EedcKO vs WT")
plot_volcano(df_dict[f'DEseq2_CNS_posterior_18_{date}.csv'], "Posterior e18.5 EedcKO vs WT")
merged_df_FEATURE_COUNTS_DEseq2Norm_20210124.csv
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	1.1990197435926193e-181	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	9.120224411470536e-171	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	5.045529900366048e-178	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	7.605542350443188e-164	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	1.09154918056555e-20	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	2.1395725981525286e-232	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	4.814710618865448e-127	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	4.737122006808789e-170	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	6.37894815569963e-18	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	5.257117117932699e-142	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	1.2842525207369799e-73	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	2.227755715762089e-206	
--------------------------------------------------------------------------------
In [12]:
"""
---------------------------------------------------------------
            Plot venn diagrams
---------------------------------------------------------------
"""
import venn
from matplotlib.colors import ListedColormap

def plot_venn(gene_sets, labels, title, save=True, show=True, colours=None):
    output = f'venn_{title.replace(" ", "-")}'
    dset = {}
    for i, l in enumerate(labels):
        dset[l] = gene_sets[i]
    if len(gene_sets) > 3:
        cmap = ListedColormap(sns.color_palette(colours))
        venn.venn(dset, cmap=cmap, fontsize=6, figsize=(2, 2))
    else:
        venn3(gene_sets, set_labels=labels)
    if save:
        save_fig(output)
    if show:
        plt.show()

fb_df = df_dict[f'DEseq2_CNS_fb_{date}.csv'][df_dict[f'DEseq2_CNS_fb_{date}.csv']['log2FoldChange'] > 0.5]
mb_df = df_dict[f'DEseq2_CNS_mb_{date}.csv'][df_dict[f'DEseq2_CNS_mb_{date}.csv']['log2FoldChange'] > 0.5]
hb_df = df_dict[f'DEseq2_CNS_hb_{date}.csv'][df_dict[f'DEseq2_CNS_hb_{date}.csv']['log2FoldChange'] > 0.5]
sc_df = df_dict[f'DEseq2_CNS_sc_{date}.csv'][df_dict[f'DEseq2_CNS_sc_{date}.csv']['log2FoldChange'] > 0.5]

fb_sig_genes = fb_df[fb_df['padj'] < 0.05][gene_name].values
mb_sig_genes = mb_df[mb_df['padj'] < 0.05][gene_name].values
hb_sig_genes = hb_df[hb_df['padj'] < 0.05][gene_name].values
sc_sig_genes = sc_df[sc_df['padj'] < 0.05][gene_name].values

plot_venn([set(fb_sig_genes), 
           set(mb_sig_genes), set(hb_sig_genes), set(sc_sig_genes)], 
          ['Forebrain WT vs KO', 'Midbrain WT vs KO', 'Hindbrain WT vs KO', 'Spinalcord WT vs KO'],
          'tissue_WTvsKO',  colours=[fb_colour, mb_colour, hb_colour, sc_colour])

plot_venn((set(fb_sig_genes), set(mb_sig_genes), set(sc_sig_genes)),
          ('Forebrain', 'Midbrain', 'Spinal cord'), 'FbMbSc')


"""
---------------------------------------------------------------
            Plot Anterior Venns
---------------------------------------------------------------
"""
e11 = df_dict[f'DEseq2_CNS_anterior_11_{date}.csv'][df_dict[f'DEseq2_CNS_anterior_11_{date}.csv']['log2FoldChange'] > 0.5]
e13 = df_dict[f'DEseq2_CNS_anterior_13_{date}.csv'][df_dict[f'DEseq2_CNS_anterior_13_{date}.csv']['log2FoldChange'] > 0.5]
e15 = df_dict[f'DEseq2_CNS_anterior_15_{date}.csv'][df_dict[f'DEseq2_CNS_anterior_15_{date}.csv']['log2FoldChange'] > 0.5]
e18 = df_dict[f'DEseq2_CNS_anterior_18_{date}.csv'][df_dict[f'DEseq2_CNS_anterior_18_{date}.csv']['log2FoldChange'] > 0.5]

e11_sig_genes = e11[e11['padj'] < 0.05][gene_name].values
e13_sig_genes = e13[e13['padj'] < 0.05][gene_name].values
e15_sig_genes = e15[e15['padj'] < 0.05][gene_name].values
e18_sig_genes = e18[e18['padj'] < 0.05][gene_name].values

plot_venn([set(e11_sig_genes), 
           set(e13_sig_genes), set(e15_sig_genes), set(e18_sig_genes)], 
          ['e11 WT vs KO', 'e13 WT vs KO', 'e15 WT vs KO', 'e18 WT vs KO'],
          'posterior_time_WTvsKO', colours=[e11_colour, e13_colour, e15_colour, e18_colour])

"""
---------------------------------------------------------------
            Plot posterior venn diagrams
---------------------------------------------------------------
"""
e11 = df_dict[f'DEseq2_CNS_posterior_11_{date}.csv'][df_dict[f'DEseq2_CNS_posterior_11_{date}.csv']['log2FoldChange'] > 0.5]
e13 = df_dict[f'DEseq2_CNS_posterior_13_{date}.csv'][df_dict[f'DEseq2_CNS_posterior_13_{date}.csv']['log2FoldChange'] > 0.5]
e15 = df_dict[f'DEseq2_CNS_posterior_15_{date}.csv'][df_dict[f'DEseq2_CNS_posterior_15_{date}.csv']['log2FoldChange'] > 0.5]
e18 = df_dict[f'DEseq2_CNS_posterior_18_{date}.csv'][df_dict[f'DEseq2_CNS_posterior_18_{date}.csv']['log2FoldChange'] > 0.5]

e11_sig_genes = e11[e11['padj'] < 0.05][gene_name].values
e13_sig_genes = e13[e13['padj'] < 0.05][gene_name].values
e15_sig_genes = e15[e15['padj'] < 0.05][gene_name].values
e18_sig_genes = e18[e18['padj'] < 0.05][gene_name].values

plot_venn([set(e11_sig_genes), 
           set(e13_sig_genes), set(e15_sig_genes), set(e18_sig_genes)], 
          ['e11 WT vs KO', 'e13 WT vs KO', 'e15 WT vs KO', 'e18 WT vs KO'],
          'posterior_time_WTvsKO', colours=[e11_colour, e13_colour, e15_colour, e18_colour])


"""
---------------------------------------------------------------
            Print out the number of significant genes
---------------------------------------------------------------
"""
fb_sig_genes = fb_df[fb_df['padj'] < 0.05][gene_name].values
mb_sig_genes = mb_df[mb_df['padj'] < 0.05][gene_name].values
hb_sig_genes = hb_df[hb_df['padj'] < 0.05][gene_name].values
sc_sig_genes = sc_df[sc_df['padj'] < 0.05][gene_name].values
u.dp(["P.adj < 0.05 for SC: KO vs WT", len(sc_sig_genes)])
u.dp(["P.adj < 0.05 for HB: KO vs WT", len(hb_sig_genes)])
u.dp(["P.adj < 0.05 for MB: KO vs WT", len(mb_sig_genes)])
u.dp(["P.adj < 0.05 for FB: KO vs WT", len(fb_sig_genes)])
--------------------------------------------------------------------------------
                       P.adj < 0.05 for SC: KO vs WT	594	                       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                      P.adj < 0.05 for HB: KO vs WT	2256	                       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                      P.adj < 0.05 for MB: KO vs WT	2031	                       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                      P.adj < 0.05 for FB: KO vs WT	2908	                       
--------------------------------------------------------------------------------

7) Make bar chart of all comparisons

Have a look at the comaprisons and just plot the number of DEGs for a summary figure.

In [13]:
"""
---------------------------------------------------------------
            Bar charts for tissue specific DE 
---------------------------------------------------------------
"""
dfs = [df_dict[f'DEseq2_CNS_wt_fb-mb_{date}.csv'], df_dict[f'DEseq2_CNS_ko_fb-mb_{date}.csv'], 
       df_dict[f'DEseq2_CNS_wt_mb-hb_{date}.csv'], df_dict[f'DEseq2_CNS_ko_mb-hb_{date}.csv'],
       df_dict[f'DEseq2_CNS_wt_hb-sc_{date}.csv'], df_dict[f'DEseq2_CNS_ko_hb-sc_{date}.csv'],
       df_dict[f'DEseq2_CNS_wt_mb-sc_{date}.csv'], df_dict[f'DEseq2_CNS_ko_mb-sc_{date}.csv'],
       df_dict[f'DEseq2_CNS_wt_fb-hb_{date}.csv'], df_dict[f'DEseq2_CNS_ko_fb-hb_{date}.csv'],
       df_dict[f'DEseq2_CNS_wt_fb-sc_{date}.csv'], df_dict[f'DEseq2_CNS_ko_fb-sc_{date}.csv'], 
       df_dict[f'DEseq2_CNS_fb_{date}.csv'], 
       df_dict[f'DEseq2_CNS_mb_{date}.csv'], 
       df_dict[f'DEseq2_CNS_hb_{date}.csv'], 
       df_dict[f'DEseq2_CNS_sc_{date}.csv']
      ]

labels = ['Forebrain vs Midbrain WT',
          'Forebrain vs Midbrain KO',
          
          'Midbrain vs Hindbrain WT',
          'Midbrain vs Hindbrain KO',
          
          'Hindbrain vs Spinalcord WT',
          'Hindbrain vs Spinalcord KO',
          
          'Midbrain vs Spinalcord WT',
          'Midbrain vs Spinalcord KO',
          
          'Forebrain vs Hindbrain WT', 
          'Forebrain vs Hindbrain KO',
          
          'Forebrain vs Spinal cord WT',
          'Forebrain vs Spinalcord KO',
          
          'WT vs KO (Forebrain)',
          'WT vs KO (Midbrain)',
          'WT vs KO (Hindbrain)', 
          'WT vs KO (Spinal cord)',
         ]
i = 0
values = []
logfc_cutoff = 0.5

for d in dfs:
    sigD = d[d['padj'] < 0.05]
    sigD = sigD[abs(sigD['log2FoldChange']) > logfc_cutoff]
    num_sig = len(sigD)
    values.append(num_sig)
    print(f'{num_sig}, {labels[i]}')
    i += 1
    
# Plot a simple bar chart.
barchart = Barchart(df, labels, values, order=labels, figsize=(1,1))
barchart.palette=sns.color_palette([wt_colour, ko_colour, wt_colour, ko_colour, wt_colour, ko_colour,
                                   wt_colour, ko_colour, wt_colour, ko_colour, wt_colour, ko_colour, fb_colour, mb_colour, hb_colour, sc_colour])
barchart.figsize=(2,1)
barchart.plot()
save_fig(f'Barchart_numSigs')
u.dp(["Barchart used p.adj cutoff:", 0.05, "\nLogFC cutoff:", logfc_cutoff])
No handles with labels found to put in legend.
3187, Forebrain vs Midbrain WT
1069, Forebrain vs Midbrain KO
1756, Midbrain vs Hindbrain WT
674, Midbrain vs Hindbrain KO
1479, Hindbrain vs Spinalcord WT
806, Hindbrain vs Spinalcord KO
2555, Midbrain vs Spinalcord WT
1532, Midbrain vs Spinalcord KO
4881, Forebrain vs Hindbrain WT
2085, Forebrain vs Hindbrain KO
4771, Forebrain vs Spinal cord WT
2604, Forebrain vs Spinalcord KO
4414, WT vs KO (Forebrain)
2617, WT vs KO (Midbrain)
2677, WT vs KO (Hindbrain)
717, WT vs KO (Spinal cord)
--------------------------------------------------------------------------------
              Barchart used p.adj cutoff:	0.05	
LogFC cutoff:	0.5	              
--------------------------------------------------------------------------------
In [14]:
"""
---------------------------------------------------------------
            Bar charts for time specific DE (anterior)
---------------------------------------------------------------
"""

dfs = [df_dict[f'DEseq2_CNS_anterior_wt_11-13_{date}.csv'], df_dict[f'DEseq2_CNS_anterior_ko_11-13_{date}.csv'], 
       df_dict[f'DEseq2_CNS_anterior_wt_13-15_{date}.csv'], df_dict[f'DEseq2_CNS_anterior_ko_13-15_{date}.csv'], 
       df_dict[f'DEseq2_CNS_anterior_wt_15-18_{date}.csv'], df_dict[f'DEseq2_CNS_anterior_ko_15-18_{date}.csv'], 
       df_dict[f'DEseq2_CNS_anterior_wt_11-15_{date}.csv'], df_dict[f'DEseq2_CNS_anterior_ko_11-15_{date}.csv'], 
       df_dict[f'DEseq2_CNS_anterior_wt_13-18_{date}.csv'], df_dict[f'DEseq2_CNS_anterior_ko_13-18_{date}.csv'], 
       df_dict[f'DEseq2_CNS_anterior_wt_11-18_{date}.csv'], df_dict[f'DEseq2_CNS_anterior_ko_11-18_{date}.csv'], 
       df_dict[f'DEseq2_CNS_anterior_11_{date}.csv'], 
       df_dict[f'DEseq2_CNS_anterior_13_{date}.csv'], 
       df_dict[f'DEseq2_CNS_anterior_15_{date}.csv'], 
       df_dict[f'DEseq2_CNS_anterior_18_{date}.csv']
      ]

labels = ['e11 vs e13 (WT)', 'e11 vs e13 (KO)',
          'e13 vs e15 (WT)', 'e13 vs e15 (KO)',
          'e15 vs e18 (WT)', 'e15 vs e18 (KO)',
          
          'e11 vs e15 (WT)', 'e11 vs e15 (KO)',
          'e13 vs e18 (WT)', 'e13 vs e18 (KO)',
          
          'e11 vs e18 (WT)', 'e11 vs e18 (KO)',

          'WT vs KO (11)', 'WT vs KO (13)', 'WT vs KO (15)','WT vs KO (18)'
         ]
i = 0
values = []
for d in dfs:
    sigD = d[d['padj'] < 0.05]
    sigD = sigD[abs(sigD['log2FoldChange']) > logfc_cutoff]
    num_sig = len(sigD)
    values.append(num_sig)
    print(f'{num_sig}, {labels[i]}')
    i += 1
    
# Plot a simple bar chart.
barchart = Barchart(df, labels, values, title='Temporal genes anterior', order=labels)
barchart.palette=sns.color_palette([wt_colour, ko_colour, wt_colour, ko_colour, wt_colour, ko_colour,
                                   wt_colour, ko_colour, wt_colour, ko_colour, wt_colour, ko_colour, 
                                    e11_colour, e13_colour, e15_colour, e18_colour])

barchart.plot()
save_fig(f'Barchart_anterior_time_numSigs')

u.dp(["Barchart used p.adj cutoff:", 0.05, "\nLogFC cutoff:", logfc_cutoff])
No handles with labels found to put in legend.
7404, e11 vs e13 (WT)
7940, e11 vs e13 (KO)
4381, e13 vs e15 (WT)
3845, e13 vs e15 (KO)
1970, e15 vs e18 (WT)
2547, e15 vs e18 (KO)
8826, e11 vs e15 (WT)
9101, e11 vs e15 (KO)
6690, e13 vs e18 (WT)
6560, e13 vs e18 (KO)
9707, e11 vs e18 (WT)
9940, e11 vs e18 (KO)
79, WT vs KO (11)
3031, WT vs KO (13)
2415, WT vs KO (15)
3794, WT vs KO (18)
--------------------------------------------------------------------------------
              Barchart used p.adj cutoff:	0.05	
LogFC cutoff:	0.5	              
--------------------------------------------------------------------------------
In [15]:
"""
---------------------------------------------------------------
            Bar charts for time specific (anterior & posterior) 
---------------------------------------------------------------
"""

dfs = [
       df_dict[f'DEseq2_CNS_anterior_11_{date}.csv'], 
       df_dict[f'DEseq2_CNS_anterior_13_{date}.csv'], 
       df_dict[f'DEseq2_CNS_anterior_15_{date}.csv'], 
       df_dict[f'DEseq2_CNS_anterior_18_{date}.csv'],
       df_dict[f'DEseq2_CNS_posterior_11_{date}.csv'], 
       df_dict[f'DEseq2_CNS_posterior_13_{date}.csv'], 
       df_dict[f'DEseq2_CNS_posterior_15_{date}.csv'], 
       df_dict[f'DEseq2_CNS_posterior_18_{date}.csv']
       
      ]

labels = [ 
        'A. WT vs KO (11)', 'A. WT vs KO (13)', 'A. WT vs KO (15)','A. WT vs KO (18)',
        'P. WT vs KO (11)', 'P. WT vs KO (13)', 'P. WT vs KO (15)','P. WT vs KO (18)'
         ]
i = 0
values = []
logfc_cutoff = 0.5
for d in dfs:
    sigD = d[d['padj'] < 0.05]
    sigD = sigD[abs(sigD['log2FoldChange']) > logfc_cutoff]
    num_sig = len(sigD)
    values.append(num_sig)
    print(f'{num_sig}, {labels[i]}')
    i += 1
    
# Plot a simple bar chart.
barchart = Barchart(dfs[0], labels, values, title='Temporal genes anterior', order=labels)
barchart.palette=sns.color_palette([e11_colour, e13_colour, e15_colour, e18_colour,
                                    e11_colour, e13_colour, e15_colour, e18_colour])
barchart.plot()
save_fig('Barchart_anterior-posterior_time_numSigs')

u.dp(["Barchart used p.adj cutoff:", 0.05, "\nLogFC cutoff:", logfc_cutoff])
No handles with labels found to put in legend.
79, A. WT vs KO (11)
3031, A. WT vs KO (13)
2415, A. WT vs KO (15)
3794, A. WT vs KO (18)
36, P. WT vs KO (11)
2388, P. WT vs KO (13)
709, P. WT vs KO (15)
1418, P. WT vs KO (18)
--------------------------------------------------------------------------------
              Barchart used p.adj cutoff:	0.05	
LogFC cutoff:	0.5	              
--------------------------------------------------------------------------------
In [17]:
"""
---------------------------------------------------------------
            Bar charts for time specific DE (posterior)
---------------------------------------------------------------
"""

dfs = [df_dict[f'DEseq2_CNS_posterior_wt_11-13_{date}.csv'], df_dict[f'DEseq2_CNS_posterior_ko_11-13_{date}.csv'], 
       df_dict[f'DEseq2_CNS_posterior_wt_13-15_{date}.csv'], df_dict[f'DEseq2_CNS_posterior_ko_13-15_{date}.csv'], 
       df_dict[f'DEseq2_CNS_posterior_wt_15-18_{date}.csv'], df_dict[f'DEseq2_CNS_posterior_ko_15-18_{date}.csv'], 
       df_dict[f'DEseq2_CNS_posterior_wt_11-15_{date}.csv'], df_dict[f'DEseq2_CNS_posterior_ko_11-15_{date}.csv'], 
       df_dict[f'DEseq2_CNS_posterior_wt_13-18_{date}.csv'], df_dict[f'DEseq2_CNS_posterior_ko_13-18_{date}.csv'], 
       df_dict[f'DEseq2_CNS_posterior_wt_11-18_{date}.csv'], df_dict[f'DEseq2_CNS_posterior_ko_11-18_{date}.csv'], 
       df_dict[f'DEseq2_CNS_posterior_11_{date}.csv'], 
       df_dict[f'DEseq2_CNS_posterior_13_{date}.csv'], 
       df_dict[f'DEseq2_CNS_posterior_15_{date}.csv'], 
       df_dict[f'DEseq2_CNS_posterior_18_{date}.csv']
      ]

labels = ['e11 vs e13 (WT)', 'e11 vs e13 (KO)',
          'e13 vs e15 (WT)', 'e13 vs e15 (KO)',
          'e15 vs e18 (WT)', 'e15 vs e18 (KO)',
          
          'e11 vs e15 (WT)', 'e11 vs e15 (KO)',
          'e13 vs e18 (WT)', 'e13 vs e18 (KO)',
          
          'e11 vs e18 (WT)', 'e11 vs e18 (KO)',

          'WT vs KO (11)', 'WT vs KO (13)', 'WT vs KO (15)','WT vs KO (18)'
         ]
i = 0
values = []
logfc_cutoff = 1.0
for d in dfs:
    sigD = d[d['padj'] < 0.05]
    sigD = sigD[abs(sigD['log2FoldChange']) > logfc_cutoff]
    num_sig = len(sigD)
    values.append(num_sig)
    print(f'{num_sig}, {labels[i]}')
    i += 1
    
# Plot a simple bar chart.
barchart = Barchart(df, labels, values, title='Temporal genes posterior', order=labels)
barchart.palette=sns.color_palette([wt_colour, ko_colour, wt_colour, ko_colour, wt_colour, ko_colour,
                                   wt_colour, ko_colour, wt_colour, ko_colour, wt_colour, ko_colour, 
                                    e11_colour, e13_colour, e15_colour, e18_colour])
barchart.plot()
save_fig(f'Barchart_posterior_time_numSigs')
u.dp(["Barchart used p.adj cutoff:", 0.05, "\nLogFC cutoff:", logfc_cutoff])
No handles with labels found to put in legend.
5273, e11 vs e13 (WT)
5198, e11 vs e13 (KO)
1878, e13 vs e15 (WT)
1017, e13 vs e15 (KO)
573, e15 vs e18 (WT)
667, e15 vs e18 (KO)
5535, e11 vs e15 (WT)
6004, e11 vs e15 (KO)
3148, e13 vs e18 (WT)
2290, e13 vs e18 (KO)
6676, e11 vs e18 (WT)
6660, e11 vs e18 (KO)
19, WT vs KO (11)
1062, WT vs KO (13)
433, WT vs KO (15)
551, WT vs KO (18)
--------------------------------------------------------------------------------
              Barchart used p.adj cutoff:	0.05	
LogFC cutoff:	1.0	              
--------------------------------------------------------------------------------

8) Select log2(TMM + 1) as normalisation

We choose the normalisation to be the log2(TMM + 1) as we are keen to look across the genes rather than just run differential expression between the genes. As such, we need to load this add the information from the various differential expression analyses.

We also merge our WT vs KO experiments from before.

In [18]:
"""
--------------------------------------------------------
Choose normalisation method for data & add annoation
--------------------------------------------------------
"""

        
# Read in normalised RNAseq data
tmm = pd.read_csv(f'{r_dir}merged_df_FEATURE_COUNTS_tmm_{date}.csv')

tmm.rename(columns={ tmm.columns[0]: gene_id }, inplace = True)

for c in tmm:
    if c != gene_id:
        tmm[c] = np.log2(tmm[c].values + 1)

df_all = tmm.copy()
df_all[gene_id] = pd.to_numeric(df_all[gene_id])

# Read in each our our DE experiments from DEseq2
fb_df = df_dict[f'DEseq2_CNS_fb_{date}.csv']
mb_df = df_dict[f'DEseq2_CNS_mb_{date}.csv']
hb_df = df_dict[f'DEseq2_CNS_hb_{date}.csv']
sc_df = df_dict[f'DEseq2_CNS_sc_{date}.csv']

a11_df = df_dict[f'DEseq2_CNS_anterior_11_{date}.csv']
a13_df = df_dict[f'DEseq2_CNS_anterior_13_{date}.csv']
a15_df = df_dict[f'DEseq2_CNS_anterior_15_{date}.csv']
a18_df = df_dict[f'DEseq2_CNS_anterior_18_{date}.csv']

p11_df = df_dict[f'DEseq2_CNS_posterior_11_{date}.csv']
p13_df = df_dict[f'DEseq2_CNS_posterior_13_{date}.csv']
p15_df = df_dict[f'DEseq2_CNS_posterior_15_{date}.csv']
p18_df = df_dict[f'DEseq2_CNS_posterior_18_{date}.csv']

# Let's add all our significant results (and make sure we just indicate where they came from)
fb_df[gene_id] = pd.to_numeric(fb_df[gene_id])
df_all = df_all.merge(fb_df, how='left', left_on=gene_id, 
                                       right_on=gene_id, suffixes=('', '_fb'))
mb_df[gene_id] = pd.to_numeric(mb_df[gene_id])
df_all = df_all.merge(mb_df, how='left', left_on=gene_id, 
                                       right_on=gene_id, suffixes=('', '_mb'))
hb_df[gene_id] = pd.to_numeric(hb_df[gene_id])
df_all = df_all.merge(hb_df, how='left', left_on=gene_id, 
                                       right_on=gene_id, suffixes=('', '_hb'))
sc_df[gene_id] = pd.to_numeric(sc_df[gene_id])
df_all = df_all.merge(sc_df, how='left', left_on=gene_id, 
                                       right_on=gene_id, suffixes=('', '_sc'))

# Now we also want to add in the results from the anterior and posterior temporal changes
a11_df[gene_id] = pd.to_numeric(a11_df[gene_id])
df_all = df_all.merge(a11_df, how='left', left_on=gene_id, 
                                       right_on=gene_id, suffixes=('', '_a11'))
a13_df[gene_id] = pd.to_numeric(a13_df[gene_id])
df_all = df_all.merge(a13_df, how='left', left_on=gene_id, 
                                       right_on=gene_id, suffixes=('', '_a13'))
a15_df[gene_id] = pd.to_numeric(a15_df[gene_id])
df_all = df_all.merge(a15_df, how='left', left_on=gene_id, 
                                       right_on=gene_id, suffixes=('', '_a15'))
a18_df[gene_id] = pd.to_numeric(a18_df[gene_id])
df_all = df_all.merge(a18_df, how='left', left_on=gene_id, 
                                       right_on=gene_id, suffixes=('', '_a18'))

p11_df[gene_id] = pd.to_numeric(p11_df[gene_id])
df_all = df_all.merge(p11_df, how='left', left_on=gene_id, 
                                       right_on=gene_id, suffixes=('', '_p11'))
p13_df[gene_id] = pd.to_numeric(p13_df[gene_id])
df_all = df_all.merge(p13_df, how='left', left_on=gene_id, 
                                       right_on=gene_id, suffixes=('', '_p13'))
p15_df[gene_id] = pd.to_numeric(p15_df[gene_id])
df_all = df_all.merge(p15_df, how='left', left_on=gene_id, 
                                       right_on=gene_id, suffixes=('', '_p15'))
p18_df[gene_id] = pd.to_numeric(p18_df[gene_id])
df_all = df_all.merge(p18_df, how='left', left_on=gene_id, 
                                       right_on=gene_id, suffixes=('', '_p18'))

# Rename the padj from forebrain and logfc
df_all.rename(columns={'padj': 'padj_fb', 'log2FoldChange': 'log2FoldChange_fb'}, inplace=True)

# Drop columns that don't have gene names --> leave this out for now.
#df_all = df_all.dropna(subset=[gene_name])

# This will be on p values that weren't in there 
p_cols = [c for c in df_all.columns if 'padj' in c]

# Make them NS i.e. given them a p value of 1.0
df_all[p_cols] = df_all[p_cols].fillna(value=1.0)

# Fill everything else with 0's so we don't have NaNs in our counts etc
df_all = df_all.fillna(value=0)
df_all = df_all.reset_index()  # Ensure the index is reset each time we rejoin everything
u.dp(['Length of dataframe', len(df_all)])

# We don't want duplicate gene IDs so we drop columns with duplicates in these values
no_na_merged = df_all.drop_duplicates()

# Collect entrez IDs
ens_ids = annot_df['ensembl_gene_id'].values
chr_ids = annot_df['chromosome_name'].values
gene_id_to_ens = {}
gene_id_to_chr = {}
for i, g in enumerate(annot_df[gene_id].values):
    if not gene_id_to_ens.get(g):
        gene_id_to_ens[g] = ens_ids[i]
    if not gene_id_to_chr.get(g):
        gene_id_to_chr[g] = chr_ids[i]
ens_ids = []
chrs = []
for g in no_na_merged[gene_id].values:
    ens_ids.append(gene_id_to_ens.get(g))
    chrs.append(gene_id_to_chr.get(g))
# Add ensembl gene id 
no_na_merged['ensembl_gene_id'] = ens_ids
no_na_merged['chrs'] = chrs

# Organise columns a bit nicer
cols = [gene_id, gene_name, 'ensembl_gene_id', 'chrs',
        'wt11fb1', 'wt11fb2', 'wt13fb1', 'wt13fb2', 'wt15fb1', 'wt15fb2', 'wt18fb1', 'wt18fb2',
        'wt11mb1', 'wt11mb2', 'wt13mb1', 'wt13mb2', 'wt15mb1', 'wt15mb2', 'wt18mb1', 'wt18mb2',
        'wt11hb1', 'wt11hb2', 'wt13hb1', 'wt13hb2', 'wt15hb1', 'wt15hb2', 'wt18hb1', 'wt18hb2',
        'wt11sc1', 'wt11sc2', 'wt13sc1', 'wt13sc2', 'wt15sc1', 'wt15sc2', 'wt18sc1', 'wt18sc2',
        
        'ko11fb1', 'ko11fb2', 'ko13fb1', 'ko13fb2', 'ko15fb1', 'ko15fb2', 'ko18fb1', 'ko18fb2',
        'ko11mb1', 'ko11mb2', 'ko13mb1', 'ko13mb2', 'ko15mb1', 'ko15mb2', 'ko18mb1', 'ko18mb2',
        'ko11hb1', 'ko11hb2', 'ko13hb1', 'ko13hb2', 'ko15hb1', 'ko15hb2', 'ko18hb1', 'ko18hb2',
        'ko11sc1', 'ko11sc2', 'ko13sc1', 'ko13sc2', 'ko15sc1', 'ko15sc2', 'ko18sc1', 'ko18sc2',
]

for c in no_na_merged:
    if c not in cols:
        cols.append(c)
        
df_all = no_na_merged[cols]
df_all = df_all.reset_index()

# Ensure the gene name hasn't been messed up 
gene_names = []
pseudo_id = []

for g in df_all[gene_id].values:
    gene_n = gene_id_to_name.get(g)
    gene_names.append(gene_n)
    pseudo_id.append(f'{g}-{gene_n}')
df_all[gene_name] = gene_names

u.dp(['Length of dataframe (no dups reset index)', len(df_all)])
--------------------------------------------------------------------------------
                           Length of dataframe	20900	                           
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                Length of dataframe (no dups reset index)	20900	                
--------------------------------------------------------------------------------

Testing gender of samples

Here we just confirm the gender of the samples based on X and Y reads.

In [22]:
y_chr = df_all[df_all['chrs'] == 'Y']
x_chr = df_all[df_all['chrs'] == 'X']


wt_cols = [c for c in df_all.columns if 'merged' not in c and 'wt' in c]
for w in wt_cols:
    plt.hist(x_chr[w].values)
    plt.title(f'WT X: {w}')
    plt.show()
    
wt_cols = [c for c in df_all.columns if 'merged' not in c and 'wt' in c]
for w in wt_cols:
    plt.hist(y_chr[w].values)
    plt.title(f'WT Y: {w}')
    plt.show() 

    
ko_cols = [c for c in df_all.columns if 'merged' not in c and 'ko' in c]
for w in ko_cols:
    plt.hist(x_chr[w].values)
    plt.title(f'KO X: {w}')
    plt.show()
    
for w in ko_cols:
    plt.hist(y_chr[w].values)
    plt.title(f'KO Y: {w}')
    plt.show() 

9) Visualise the merged data top most significant genes

Here we visualise the top genes by padj in the forebrain and validate that these align with the other results.

We observe the Hox genes as the most diverged.

In [23]:
from matplotlib.colors import ListedColormap
from sklearn.decomposition import PCA
from sklearn.preprocessing import MinMaxScaler
from sciviso import Scatterplot
import umap
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

from sciviso import Vis

# Plot PCA of the forebrain genes i.e. so we see how time separates
def get_cond_colour(c):
    if 'ko' in c:
        return ko_colour
    elif 'wt' in c:
        return wt_colour
    return '#FFFFFF'

def plt_heatmap(df, idxs, title='', plt_sig=True, mark=None, row_colours=None, cond=''):
    # Now let's look at a couple of these, and how similar these are to the other data
    heatmap_df = pd.DataFrame()
    # Select our genes of interest
    if plt_sig:
        ps = df['padj'].values[idxs]
        p_idxs = np.where(ps < 0.05)
        heatmap_df[gene_name] = df[gene_name].values[idxs][p_idxs]
    else:
        heatmap_df[gene_name] = df[gene_name].values[idxs]

    cols = []
    col_colours = []
    cond_colours = []
    tissue_colours = []
    time_colours = []
    for c in df.columns:
        if mark == None:
            if (cond in c) and 'merged' in c:
                cname = c.split('_')[0]
                print(c)
                if plt_sig:
                    heatmap_df[cname] = df[c].values[idxs][p_idxs]
                else:
                    heatmap_df[cname] = df[c].values[idxs]
                cols.append(cname)
                time_colours.append(get_time_colour(c))
                tissue_colours.append(get_tissue_colour(c))
                cond_colours.append(get_cond_colour(c))
        else:
            if mark in c and hist_metric in c:
                if plt_sig:
                    heatmap_df[c] = np.log2(df[c].values[idxs][p_idxs] + 1)
                else:
                    heatmap_df[c] =  np.log2(df[c].values[idxs] + 1)
                    
                time_colours.append(get_time_colour(c))
                tissue_colours.append(get_tissue_colour(c))
                cond_colours.append(get_mark_colour(c))
                cols.append(c)
    col_colours = [cond_colours, tissue_colours, time_colours]
    
    heatmap = Heatmap(heatmap_df, cols, gene_name, linewidths=0.0, cmap="Purples", x_tick_labels=0, 
                      figsize=(2, 1.5), vmin=0, vmax=9,
                      title=f'{title}', cluster_cols=False, cluster_rows=True, 
                     col_colours=col_colours, row_colours=row_colours)
    
    ax = heatmap.plot()
            
    ax.ax_heatmap.tick_params(direction='out', length=2, width=0.5)
    ax.ax_heatmap.tick_params(labelsize=6)
    ax.ax_heatmap.tick_params(axis='x', which='major', pad=2.0)
    ax.ax_heatmap.tick_params(axis='y', which='major', pad=2.0)

    pplot()
    save_fig(title)
    plt.show()

Plot PCA with each of the features

In [33]:
cols, edges = [], []
e13, e13_colours = [], []
e18, e18_colours = [], []
e15, e15_colours = [], []
tissue_colours = []
time_colours = []
cond_colours = []
wt_tissue_colours = []
wt_time_colours = []
wt_cond_colours = []
ko_tissue_colours = []
ko_time_colours = []
ko_cond_colours = []
c_i = 0
wt_pts = []
ko_pts = []
line_styles = []
for c in df_all.columns:
    if ('wt' in c or 'ko' in c) and '11' not in c:
        cols.append(c)
        if 'wt' in c:
            wt_tissue_colours.append(get_tissue_colour(c))
            wt_time_colours.append(get_time_colour(c))
            wt_pts.append(c_i)
        elif 'ko' in c:
            ko_tissue_colours.append(get_tissue_colour(c))
            ko_time_colours.append(get_time_colour(c))
            ko_pts.append(c_i)
        if '13' in c:
            e13.append(c_i)
            e13_colours.append(get_tissue_colour(c))
            tissue_colours.append(get_tissue_colour(c))
            time_colours.append(get_time_colour(c))
            cond_colours.append(get_cond_colour(c))
        elif '15' in c:
            e15.append(c_i)
            e15_colours.append(get_tissue_colour(c))
            tissue_colours.append(get_tissue_colour(c))
            time_colours.append(get_time_colour(c))
            cond_colours.append(get_cond_colour(c))
        elif '18' in c:
            e18.append(c_i)
            e18_colours.append(get_tissue_colour(c))
            # Make the edges the colour of our condition
            edges.append(get_cond_colour(c))
            tissue_colours.append(get_tissue_colour(c))
            time_colours.append(get_time_colour(c))
            cond_colours.append(get_cond_colour(c))
            if 'wt' in c:
                line_styles.append('-')
            else:
                line_styles.append('-')
        c_i += 1

vals = (df_all[cols].values).T #np.log2(df_all[cols].values + 1).T
fb_pca = PCA(n_components=2)
fb_pca_values = fb_pca.fit_transform(vals)
var_ratio = fb_pca.fit(vals).explained_variance_ratio_

"""
---------------------------------------------------------------
            Plot PCA
---------------------------------------------------------------
"""
plt.rcParams['figure.figsize'] = [2, 2]

plt.scatter(fb_pca_values[e13,0], fb_pca_values[e13,1], linestyle=line_styles, c=e13_colours, s=100, marker=">", edgecolors=edges, linewidths=1.5)
plt.scatter(fb_pca_values[e15,0], fb_pca_values[e15,1], linestyle=line_styles, c=e15_colours, s=100, marker="o", edgecolors=edges, linewidths=1.5)
plt.scatter(fb_pca_values[e18,0], fb_pca_values[e18,1], linestyle=line_styles, c=e18_colours, s=100, marker="X", edgecolors=edges, linewidths=1.5)

plt.title(f'PCA VAR: 0: {var_ratio[0]}, 1: {var_ratio[1]}')
save_fig(f'PCA_ne11_fb')
# Now we want to fit everything except the gene IDs 
plt.show()


"""
---------------------------------------------------------------
            PCA coloured by tissue
---------------------------------------------------------------
"""
plt.rcParams['figure.figsize'] = [2, 2]

plt.scatter(fb_pca_values[wt_pts,0], fb_pca_values[wt_pts,1], linestyle=line_styles, c=wt_tissue_colours, s=100, marker="o", edgecolors='black', linewidths=0.5)
plt.scatter(fb_pca_values[ko_pts,0], fb_pca_values[ko_pts,1], linestyle=line_styles, c=ko_tissue_colours, s=100, marker="X", edgecolors='black', linewidths=0.5)

plt.title(f'PCA VAR: 0: {var_ratio[0]}, 1: {var_ratio[1]}')
save_fig(f'PCA_ne11_tissue')
# Now we want to fit everything except the gene IDs 
plt.show()


"""
---------------------------------------------------------------
            PCA coloured by time
---------------------------------------------------------------
"""
plt.rcParams['figure.figsize'] = [2, 2]

plt.scatter(fb_pca_values[wt_pts,0], fb_pca_values[wt_pts,1], linestyle=line_styles, c=wt_time_colours, s=100, marker="o", edgecolors='black', linewidths=0.5)
plt.scatter(fb_pca_values[ko_pts,0], fb_pca_values[ko_pts,1], linestyle=line_styles, c=ko_time_colours, s=100, marker="X", edgecolors='black', linewidths=0.5)

plt.title(f'PCA VAR: 0: {var_ratio[0]}, 1: {var_ratio[1]}')
save_fig(f'PCA_ne11_time')
# Now we want to fit everything except the gene IDs 
plt.show()

Merge samples and do sample clustermap

Here we just do a sample clustermap

In [88]:
"""
---------------------------------------------------------------
            Sample clustermap for anterior late stage
---------------------------------------------------------------
"""
# Smooth out the columns in the data frame i.e. for the clones we only put in the mean of the two replicates
log2_df = pd.DataFrame()
cols_to_merge = [c for c in df_all.columns if 'wt' in c or 'ko' in c]
col_names = []
col_values = []
i = 0
tissue_colours = []
cond_colours = []
time_colours = []
while(i < len(cols_to_merge)):
    if ('mb' in cols_to_merge[i] or 'fb' in cols_to_merge[i]) and ('15' in cols_to_merge[i]  or '18' in cols_to_merge[i]):
        log2_df[cols_to_merge[i][:-1]] = 0.5 * ((df_all[cols_to_merge[i]].values + 1) +
                                                                    df_all[cols_to_merge[i + 1]].values)
        tissue_colours.append(get_tissue_colour(cols_to_merge[i]))
        time_colours.append(get_time_colour(cols_to_merge[i]))
        cond_colours.append(get_cond_colour(cols_to_merge[i]))

        print("merged", cols_to_merge[i], cols_to_merge[i + 1])
    i += 2
    
    
plt.rcParams['figure.figsize'] = [3, 3]

row_colors_t = [cond_colours, tissue_colours, time_colours]
corr = log2_df.corr()
sns.clustermap(corr, 
                    xticklabels=corr.columns.values,
                    yticklabels=corr.columns.values, cmap='RdBu_r', row_cluster=True, 
                    col_cluster=True, row_colors=row_colors_t)
save_fig(f'Heatmap_sample_cluster')
plt.show()

"""
---------------------------------------------------------------
            Sample clustermap for all
---------------------------------------------------------------
"""
# Smooth out the columns in the data frame i.e. for the clones we only put in the mean of the two replicates
log2_df = pd.DataFrame()
cols_to_merge = [c for c in df_all.columns if 'wt' in c or 'ko' in c]
col_names = []
col_values = []
i = 0
tissue_colours = []
cond_colours = []
time_colours = []
while(i < len(cols_to_merge)):
    log2_df[cols_to_merge[i][:-1]] = 0.5 * ((df_all[cols_to_merge[i]].values + 1) +
                                                                df_all[cols_to_merge[i + 1]].values)
    tissue_colours.append(get_tissue_colour(cols_to_merge[i]))
    time_colours.append(get_time_colour(cols_to_merge[i]))
    cond_colours.append(get_cond_colour(cols_to_merge[i]))

    print("merged", cols_to_merge[i], cols_to_merge[i + 1])
    i += 2
    
    
plt.rcParams['figure.figsize'] = [3, 3]

row_colors_t = [cond_colours, tissue_colours, time_colours]
corr = log2_df.corr()
sns.clustermap(corr, 
                    xticklabels=corr.columns.values,
                    yticklabels=corr.columns.values, cmap='RdBu_r', row_cluster=True, 
                    col_cluster=True, row_colors=row_colors_t)
save_fig(f'Heatmap_sample_cluster_all')


"""
---------------------------------------------------------------
            Sample clustermap for E11
---------------------------------------------------------------
"""
# Smooth out the columns in the data frame i.e. for the clones we only put in the mean of the two replicates
log2_df = pd.DataFrame()
cols_to_merge = [c for c in df_all.columns if 'wt' in c or 'ko' in c]
col_names = []
col_values = []
i = 0
tissue_colours = []
cond_colours = []
time_colours = []
while(i < len(cols_to_merge)):
    if '11' in cols_to_merge[i]:
        log2_df[cols_to_merge[i][:-1]] = 0.5 * ((df_all[cols_to_merge[i]].values + 1) +
                                                                    df_all[cols_to_merge[i + 1]].values)
        tissue_colours.append(get_tissue_colour(cols_to_merge[i]))
        time_colours.append(get_time_colour(cols_to_merge[i]))
        cond_colours.append(get_cond_colour(cols_to_merge[i]))

        print("merged", cols_to_merge[i], cols_to_merge[i + 1])
    i += 2
    
    
plt.rcParams['figure.figsize'] = [3, 3]

row_colors_t = [cond_colours, tissue_colours, time_colours]
corr = log2_df.corr()
sns.clustermap(corr, 
                    xticklabels=corr.columns.values,
                    yticklabels=corr.columns.values, cmap='RdBu_r', row_cluster=True, 
                    col_cluster=True, row_colors=row_colors_t)
save_fig(f'Heatmap_sample_cluster_Ell')


"""
---------------------------------------------------------------
            Sample clustermap for E13
---------------------------------------------------------------
"""
# Smooth out the columns in the data frame i.e. for the clones we only put in the mean of the two replicates
log2_df = pd.DataFrame()
cols_to_merge = [c for c in df_all.columns if 'wt' in c or 'ko' in c]
col_names = []
col_values = []
i = 0
tissue_colours = []
cond_colours = []
time_colours = []
while(i < len(cols_to_merge)):
    if '13' in cols_to_merge[i]:
        log2_df[cols_to_merge[i][:-1]] = 0.5 * ((df_all[cols_to_merge[i]].values + 1) +
                                                                    df_all[cols_to_merge[i + 1]].values)
        tissue_colours.append(get_tissue_colour(cols_to_merge[i]))
        time_colours.append(get_time_colour(cols_to_merge[i]))
        cond_colours.append(get_cond_colour(cols_to_merge[i]))

        print("merged", cols_to_merge[i], cols_to_merge[i + 1])
    i += 2
    
    
plt.rcParams['figure.figsize'] = [3, 3]

row_colors_t = [cond_colours, tissue_colours, time_colours]
corr = log2_df.corr()
sns.clustermap(corr, 
                    xticklabels=corr.columns.values,
                    yticklabels=corr.columns.values, cmap='RdBu_r', row_cluster=True, 
                    col_cluster=True, row_colors=row_colors_t)
save_fig(f'Heatmap_sample_cluster_El3')


"""
---------------------------------------------------------------
            Sample clustermap for E15
---------------------------------------------------------------
"""
# Smooth out the columns in the data frame i.e. for the clones we only put in the mean of the two replicates
log2_df = pd.DataFrame()
cols_to_merge = [c for c in df_all.columns if 'wt' in c or 'ko' in c]
col_names = []
col_values = []
i = 0
tissue_colours = []
cond_colours = []
time_colours = []
while(i < len(cols_to_merge)):
    if '15' in cols_to_merge[i]:
        log2_df[cols_to_merge[i][:-1]] = 0.5 * ((df_all[cols_to_merge[i]].values + 1) +
                                                                    df_all[cols_to_merge[i + 1]].values)
        tissue_colours.append(get_tissue_colour(cols_to_merge[i]))
        time_colours.append(get_time_colour(cols_to_merge[i]))
        cond_colours.append(get_cond_colour(cols_to_merge[i]))

        print("merged", cols_to_merge[i], cols_to_merge[i + 1])
    i += 2
    
    
plt.rcParams['figure.figsize'] = [3, 3]

row_colors_t = [cond_colours, tissue_colours, time_colours]
corr = log2_df.corr()
sns.clustermap(corr, 
                    xticklabels=corr.columns.values,
                    yticklabels=corr.columns.values, cmap='RdBu_r', row_cluster=True, 
                    col_cluster=True, row_colors=row_colors_t)
save_fig(f'Heatmap_sample_cluster_El5')


"""
---------------------------------------------------------------
            Sample clustermap for E18
---------------------------------------------------------------
"""
# Smooth out the columns in the data frame i.e. for the clones we only put in the mean of the two replicates
log2_df = pd.DataFrame()
cols_to_merge = [c for c in df_all.columns if 'wt' in c or 'ko' in c]
col_names = []
col_values = []
i = 0
tissue_colours = []
cond_colours = []
time_colours = []
while(i < len(cols_to_merge)):
    if '15' in cols_to_merge[i]:
        log2_df[cols_to_merge[i][:-1]] = 0.5 * ((df_all[cols_to_merge[i]].values + 1) +
                                                                    df_all[cols_to_merge[i + 1]].values)
        tissue_colours.append(get_tissue_colour(cols_to_merge[i]))
        time_colours.append(get_time_colour(cols_to_merge[i]))
        cond_colours.append(get_cond_colour(cols_to_merge[i]))

        print("merged", cols_to_merge[i], cols_to_merge[i + 1])
    i += 2
    
plt.rcParams['figure.figsize'] = [3, 3]

row_colors_t = [cond_colours, tissue_colours, time_colours]
corr = log2_df.corr()
sns.clustermap(corr, 
                    xticklabels=corr.columns.values,
                    yticklabels=corr.columns.values, cmap='RdBu_r', row_cluster=True, 
                    col_cluster=True, row_colors=row_colors_t)
save_fig(f'Heatmap_sample_cluster_El8')
merged wt15fb1 wt15fb2
merged wt18fb1 wt18fb2
merged wt15mb1 wt15mb2
merged wt18mb1 wt18mb2
merged ko15fb1 ko15fb2
merged ko18fb1 ko18fb2
merged ko15mb1 ko15mb2
merged ko18mb1 ko18mb2
merged wt11fb1 wt11fb2
merged wt13fb1 wt13fb2
merged wt15fb1 wt15fb2
merged wt18fb1 wt18fb2
merged wt11mb1 wt11mb2
merged wt13mb1 wt13mb2
merged wt15mb1 wt15mb2
merged wt18mb1 wt18mb2
merged wt11hb1 wt11hb2
merged wt13hb1 wt13hb2
merged wt15hb1 wt15hb2
merged wt18hb1 wt18hb2
merged wt11sc1 wt11sc2
merged wt13sc1 wt13sc2
merged wt15sc1 wt15sc2
merged wt18sc1 wt18sc2
merged ko11fb1 ko11fb2
merged ko13fb1 ko13fb2
merged ko15fb1 ko15fb2
merged ko18fb1 ko18fb2
merged ko11mb1 ko11mb2
merged ko13mb1 ko13mb2
merged ko15mb1 ko15mb2
merged ko18mb1 ko18mb2
merged ko11hb1 ko11hb2
merged ko13hb1 ko13hb2
merged ko15hb1 ko15hb2
merged ko18hb1 ko18hb2
merged ko11sc1 ko11sc2
merged ko13sc1 ko13sc2
merged ko15sc1 ko15sc2
merged ko18sc1 ko18sc2
merged wt11fb1 wt11fb2
merged wt11mb1 wt11mb2
merged wt11hb1 wt11hb2
merged wt11sc1 wt11sc2
merged ko11fb1 ko11fb2
merged ko11mb1 ko11mb2
merged ko11hb1 ko11hb2
merged ko11sc1 ko11sc2
merged wt13fb1 wt13fb2
merged wt13mb1 wt13mb2
merged wt13hb1 wt13hb2
merged wt13sc1 wt13sc2
merged ko13fb1 ko13fb2
merged ko13mb1 ko13mb2
merged ko13hb1 ko13hb2
merged ko13sc1 ko13sc2
merged wt15fb1 wt15fb2
merged wt15mb1 wt15mb2
merged wt15hb1 wt15hb2
merged wt15sc1 wt15sc2
merged ko15fb1 ko15fb2
merged ko15mb1 ko15mb2
merged ko15hb1 ko15hb2
merged ko15sc1 ko15sc2
merged wt15fb1 wt15fb2
merged wt15mb1 wt15mb2
merged wt15hb1 wt15hb2
merged wt15sc1 wt15sc2
merged ko15fb1 ko15fb2
merged ko15mb1 ko15mb2
merged ko15hb1 ko15hb2
merged ko15sc1 ko15sc2

Plot heatmaps of samples RNAseq

Plot the top and bottom of the heatmaps of each of the DE analyses.

In [34]:
"""
---------------------------------------------------------------
            Select top 20 DE genes (by padj in Forebrain)
---------------------------------------------------------------
"""
n_genes = 10
n_g_neg = -10
# Select the locfc that is the forebrain WT vs KO
merged_df = df_all.copy()

cols_to_merge = [c for c in merged_df.columns if 'wt' in c or 'ko' in c]
col_names = []
col_values = []
values = merged_df[cols_to_merge].values
i = 0
while(i < len(cols_to_merge)):
    merged_df[f'{cols_to_merge[i][:-1]}_merged-rep'] = 0.5 * (merged_df[cols_to_merge[i]].values +
                                                                merged_df[cols_to_merge[i + 1]].values)
    print("merged", cols_to_merge[i], cols_to_merge[i + 1])
    i += 2

log2FoldChange = merged_df['stat'].values
selected_idxs = list(np.argpartition(log2FoldChange, n_genes)[:n_genes]) # np.argpartition(x, -k)[-k:]

col_colours = ['#6666ff'] * n_genes + ['#990033'] * n_genes

"""
---------------------------------------------------------------
            Plot heatmap
---------------------------------------------------------------
"""

plt_heatmap(merged_df, selected_idxs, f'Top genes', False, row_colours=None, cond="")
plt.show()

selected_idxs = list(np.argpartition(log2FoldChange, n_g_neg)[n_g_neg:])
plt_heatmap(merged_df, selected_idxs, f'Bottom genes', False, row_colours=None, cond="")
plt.show()

"""
---------------------------------------------------------------
            Setup PCA
---------------------------------------------------------------
"""
cols, edges = [], []
e13, e13_colours = [], []
e18, e18_colours = [], []
e15, e15_colours = [], []
c_i = 0
line_styles = []
for c in df_all.columns:
    if ('wt' in c or 'ko' in c) and '11' not in c:
        cols.append(c)
        if '13' in c:
            e13.append(c_i)
            e13_colours.append(get_tissue_colour(c))
        elif '15' in c:
            e15.append(c_i)
            e15_colours.append(get_tissue_colour(c))
        elif '18' in c:
            e18.append(c_i)
            e18_colours.append(get_tissue_colour(c))
            # Make the edges the colour of our condition
            edges.append(get_cond_colour(c))
            if 'wt' in c:
                line_styles.append('-')
            else:
                line_styles.append('-')
        c_i += 1

vals = np.log2(df_all[cols].values + 1).T
fb_pca = PCA(n_components=2)
fb_pca_values = fb_pca.fit_transform(vals)
var_ratio = fb_pca.fit(vals).explained_variance_ratio_

"""
---------------------------------------------------------------
            Plot PCA
---------------------------------------------------------------
"""
plt.rcParams['figure.figsize'] = [2, 2]

plt.scatter(fb_pca_values[e13,0], fb_pca_values[e13,1], linestyle=line_styles, c=e13_colours, s=100, marker=">", edgecolors=edges, linewidths=1.5)
plt.scatter(fb_pca_values[e15,0], fb_pca_values[e15,1], linestyle=line_styles, c=e15_colours, s=100, marker="o", edgecolors=edges, linewidths=1.5)
plt.scatter(fb_pca_values[e18,0], fb_pca_values[e18,1], linestyle=line_styles, c=e18_colours, s=100, marker="X", edgecolors=edges, linewidths=1.5)

plt.title(f'PCA VAR: 0: {var_ratio[0]}, 1: {var_ratio[1]}')
save_fig(f'PCA_ne11_fb')
# Now we want to fit everything except the gene IDs 
plt.show()
merged wt11fb1 wt11fb2
merged wt13fb1 wt13fb2
merged wt15fb1 wt15fb2
merged wt18fb1 wt18fb2
merged wt11mb1 wt11mb2
merged wt13mb1 wt13mb2
merged wt15mb1 wt15mb2
merged wt18mb1 wt18mb2
merged wt11hb1 wt11hb2
merged wt13hb1 wt13hb2
merged wt15hb1 wt15hb2
merged wt18hb1 wt18hb2
merged wt11sc1 wt11sc2
merged wt13sc1 wt13sc2
merged wt15sc1 wt15sc2
merged wt18sc1 wt18sc2
merged ko11fb1 ko11fb2
merged ko13fb1 ko13fb2
merged ko15fb1 ko15fb2
merged ko18fb1 ko18fb2
merged ko11mb1 ko11mb2
merged ko13mb1 ko13mb2
merged ko15mb1 ko15mb2
merged ko18mb1 ko18mb2
merged ko11hb1 ko11hb2
merged ko13hb1 ko13hb2
merged ko15hb1 ko15hb2
merged ko18hb1 ko18hb2
merged ko11sc1 ko11sc2
merged ko13sc1 ko13sc2
merged ko15sc1 ko15sc2
merged ko18sc1 ko18sc2
wt11fb_merged-rep
wt13fb_merged-rep
wt15fb_merged-rep
wt18fb_merged-rep
wt11mb_merged-rep
wt13mb_merged-rep
wt15mb_merged-rep
wt18mb_merged-rep
wt11hb_merged-rep
wt13hb_merged-rep
wt15hb_merged-rep
wt18hb_merged-rep
wt11sc_merged-rep
wt13sc_merged-rep
wt15sc_merged-rep
wt18sc_merged-rep
ko11fb_merged-rep
ko13fb_merged-rep
ko15fb_merged-rep
ko18fb_merged-rep
ko11mb_merged-rep
ko13mb_merged-rep
ko15mb_merged-rep
ko18mb_merged-rep
ko11hb_merged-rep
ko13hb_merged-rep
ko15hb_merged-rep
ko18hb_merged-rep
ko11sc_merged-rep
ko13sc_merged-rep
ko15sc_merged-rep
ko18sc_merged-rep
wt11fb_merged-rep
wt13fb_merged-rep
wt15fb_merged-rep
wt18fb_merged-rep
wt11mb_merged-rep
wt13mb_merged-rep
wt15mb_merged-rep
wt18mb_merged-rep
wt11hb_merged-rep
wt13hb_merged-rep
wt15hb_merged-rep
wt18hb_merged-rep
wt11sc_merged-rep
wt13sc_merged-rep
wt15sc_merged-rep
wt18sc_merged-rep
ko11fb_merged-rep
ko13fb_merged-rep
ko15fb_merged-rep
ko18fb_merged-rep
ko11mb_merged-rep
ko13mb_merged-rep
ko15mb_merged-rep
ko18mb_merged-rep
ko11hb_merged-rep
ko13hb_merged-rep
ko15hb_merged-rep
ko18hb_merged-rep
ko11sc_merged-rep
ko13sc_merged-rep
ko15sc_merged-rep
ko18sc_merged-rep
In [35]:
"""
---------------------------------------------------------------
            Select top 20 DE genes (by padj in Forebrain)
---------------------------------------------------------------
"""
n_genes = 10
n_g_neg = -10
val = 'stat'
# Select the locfc that is the forebrain WT vs KO
for cond in ['a11', 'a13', 'a15', 'a18', 'p11', 'p13', 'p15', 'p18']:
    e18_gene_df = merged_df.copy()

    log2FoldChange = e18_gene_df[f'{val}_{cond}'].values
    selected_idxs = list(np.argpartition(log2FoldChange, n_genes)[:n_genes]) # np.argpartition(x, -k)[-k:]


    col_colours = ['#6666ff'] * n_genes + ['#990033'] * n_genes

    """
    ---------------------------------------------------------------
                Plot heatmap
    ---------------------------------------------------------------
    """

    plt_heatmap(merged_df, selected_idxs, f'Downregulated genes {cond}', False, row_colours=None)
    plt.show()

    log2FoldChange = e18_gene_df[f'log2FoldChange_{cond}'].values
    selected_idxs = list(np.argpartition(log2FoldChange, n_g_neg)[n_g_neg:])
    plt_heatmap(merged_df, selected_idxs, f'Upregulated genes {cond}', False, row_colours=None)
    plt.show()
wt11fb_merged-rep
wt13fb_merged-rep
wt15fb_merged-rep
wt18fb_merged-rep
wt11mb_merged-rep
wt13mb_merged-rep
wt15mb_merged-rep
wt18mb_merged-rep
wt11hb_merged-rep
wt13hb_merged-rep
wt15hb_merged-rep
wt18hb_merged-rep
wt11sc_merged-rep
wt13sc_merged-rep
wt15sc_merged-rep
wt18sc_merged-rep
ko11fb_merged-rep
ko13fb_merged-rep
ko15fb_merged-rep
ko18fb_merged-rep
ko11mb_merged-rep
ko13mb_merged-rep
ko15mb_merged-rep
ko18mb_merged-rep
ko11hb_merged-rep
ko13hb_merged-rep
ko15hb_merged-rep
ko18hb_merged-rep
ko11sc_merged-rep
ko13sc_merged-rep
ko15sc_merged-rep
ko18sc_merged-rep
wt11fb_merged-rep
wt13fb_merged-rep
wt15fb_merged-rep
wt18fb_merged-rep
wt11mb_merged-rep
wt13mb_merged-rep
wt15mb_merged-rep
wt18mb_merged-rep
wt11hb_merged-rep
wt13hb_merged-rep
wt15hb_merged-rep
wt18hb_merged-rep
wt11sc_merged-rep
wt13sc_merged-rep
wt15sc_merged-rep
wt18sc_merged-rep
ko11fb_merged-rep
ko13fb_merged-rep
ko15fb_merged-rep
ko18fb_merged-rep
ko11mb_merged-rep
ko13mb_merged-rep
ko15mb_merged-rep
ko18mb_merged-rep
ko11hb_merged-rep
ko13hb_merged-rep
ko15hb_merged-rep
ko18hb_merged-rep
ko11sc_merged-rep
ko13sc_merged-rep
ko15sc_merged-rep
ko18sc_merged-rep
wt11fb_merged-rep
wt13fb_merged-rep
wt15fb_merged-rep
wt18fb_merged-rep
wt11mb_merged-rep
wt13mb_merged-rep
wt15mb_merged-rep
wt18mb_merged-rep
wt11hb_merged-rep
wt13hb_merged-rep
wt15hb_merged-rep
wt18hb_merged-rep
wt11sc_merged-rep
wt13sc_merged-rep
wt15sc_merged-rep
wt18sc_merged-rep
ko11fb_merged-rep
ko13fb_merged-rep
ko15fb_merged-rep
ko18fb_merged-rep
ko11mb_merged-rep
ko13mb_merged-rep
ko15mb_merged-rep
ko18mb_merged-rep
ko11hb_merged-rep
ko13hb_merged-rep
ko15hb_merged-rep
ko18hb_merged-rep
ko11sc_merged-rep
ko13sc_merged-rep
ko15sc_merged-rep
ko18sc_merged-rep
wt11fb_merged-rep
wt13fb_merged-rep
wt15fb_merged-rep
wt18fb_merged-rep
wt11mb_merged-rep
wt13mb_merged-rep
wt15mb_merged-rep
wt18mb_merged-rep
wt11hb_merged-rep
wt13hb_merged-rep
wt15hb_merged-rep
wt18hb_merged-rep
wt11sc_merged-rep
wt13sc_merged-rep
wt15sc_merged-rep
wt18sc_merged-rep
ko11fb_merged-rep
ko13fb_merged-rep
ko15fb_merged-rep
ko18fb_merged-rep
ko11mb_merged-rep
ko13mb_merged-rep
ko15mb_merged-rep
ko18mb_merged-rep
ko11hb_merged-rep
ko13hb_merged-rep
ko15hb_merged-rep
ko18hb_merged-rep
ko11sc_merged-rep
ko13sc_merged-rep
ko15sc_merged-rep
ko18sc_merged-rep
wt11fb_merged-rep
wt13fb_merged-rep
wt15fb_merged-rep
wt18fb_merged-rep
wt11mb_merged-rep
wt13mb_merged-rep
wt15mb_merged-rep
wt18mb_merged-rep
wt11hb_merged-rep
wt13hb_merged-rep
wt15hb_merged-rep
wt18hb_merged-rep
wt11sc_merged-rep
wt13sc_merged-rep
wt15sc_merged-rep
wt18sc_merged-rep
ko11fb_merged-rep
ko13fb_merged-rep
ko15fb_merged-rep
ko18fb_merged-rep
ko11mb_merged-rep
ko13mb_merged-rep
ko15mb_merged-rep
ko18mb_merged-rep
ko11hb_merged-rep
ko13hb_merged-rep
ko15hb_merged-rep
ko18hb_merged-rep
ko11sc_merged-rep
ko13sc_merged-rep
ko15sc_merged-rep
ko18sc_merged-rep
wt11fb_merged-rep
wt13fb_merged-rep
wt15fb_merged-rep
wt18fb_merged-rep
wt11mb_merged-rep
wt13mb_merged-rep
wt15mb_merged-rep
wt18mb_merged-rep
wt11hb_merged-rep
wt13hb_merged-rep
wt15hb_merged-rep
wt18hb_merged-rep
wt11sc_merged-rep
wt13sc_merged-rep
wt15sc_merged-rep
wt18sc_merged-rep
ko11fb_merged-rep
ko13fb_merged-rep
ko15fb_merged-rep
ko18fb_merged-rep
ko11mb_merged-rep
ko13mb_merged-rep
ko15mb_merged-rep
ko18mb_merged-rep
ko11hb_merged-rep
ko13hb_merged-rep
ko15hb_merged-rep
ko18hb_merged-rep
ko11sc_merged-rep
ko13sc_merged-rep
ko15sc_merged-rep
ko18sc_merged-rep
wt11fb_merged-rep
wt13fb_merged-rep
wt15fb_merged-rep
wt18fb_merged-rep
wt11mb_merged-rep
wt13mb_merged-rep
wt15mb_merged-rep
wt18mb_merged-rep
wt11hb_merged-rep
wt13hb_merged-rep
wt15hb_merged-rep
wt18hb_merged-rep
wt11sc_merged-rep
wt13sc_merged-rep
wt15sc_merged-rep
wt18sc_merged-rep
ko11fb_merged-rep
ko13fb_merged-rep
ko15fb_merged-rep
ko18fb_merged-rep
ko11mb_merged-rep
ko13mb_merged-rep
ko15mb_merged-rep
ko18mb_merged-rep
ko11hb_merged-rep
ko13hb_merged-rep
ko15hb_merged-rep
ko18hb_merged-rep
ko11sc_merged-rep
ko13sc_merged-rep
ko15sc_merged-rep
ko18sc_merged-rep
wt11fb_merged-rep
wt13fb_merged-rep
wt15fb_merged-rep
wt18fb_merged-rep
wt11mb_merged-rep
wt13mb_merged-rep
wt15mb_merged-rep
wt18mb_merged-rep
wt11hb_merged-rep
wt13hb_merged-rep
wt15hb_merged-rep
wt18hb_merged-rep
wt11sc_merged-rep
wt13sc_merged-rep
wt15sc_merged-rep
wt18sc_merged-rep
ko11fb_merged-rep
ko13fb_merged-rep
ko15fb_merged-rep
ko18fb_merged-rep
ko11mb_merged-rep
ko13mb_merged-rep
ko15mb_merged-rep
ko18mb_merged-rep
ko11hb_merged-rep
ko13hb_merged-rep
ko15hb_merged-rep
ko18hb_merged-rep
ko11sc_merged-rep
ko13sc_merged-rep
ko15sc_merged-rep
ko18sc_merged-rep
wt11fb_merged-rep
wt13fb_merged-rep
wt15fb_merged-rep
wt18fb_merged-rep
wt11mb_merged-rep
wt13mb_merged-rep
wt15mb_merged-rep
wt18mb_merged-rep
wt11hb_merged-rep
wt13hb_merged-rep
wt15hb_merged-rep
wt18hb_merged-rep
wt11sc_merged-rep
wt13sc_merged-rep
wt15sc_merged-rep
wt18sc_merged-rep
ko11fb_merged-rep
ko13fb_merged-rep
ko15fb_merged-rep
ko18fb_merged-rep
ko11mb_merged-rep
ko13mb_merged-rep
ko15mb_merged-rep
ko18mb_merged-rep
ko11hb_merged-rep
ko13hb_merged-rep
ko15hb_merged-rep
ko18hb_merged-rep
ko11sc_merged-rep
ko13sc_merged-rep
ko15sc_merged-rep
ko18sc_merged-rep
wt11fb_merged-rep
wt13fb_merged-rep
wt15fb_merged-rep
wt18fb_merged-rep
wt11mb_merged-rep
wt13mb_merged-rep
wt15mb_merged-rep
wt18mb_merged-rep
wt11hb_merged-rep
wt13hb_merged-rep
wt15hb_merged-rep
wt18hb_merged-rep
wt11sc_merged-rep
wt13sc_merged-rep
wt15sc_merged-rep
wt18sc_merged-rep
ko11fb_merged-rep
ko13fb_merged-rep
ko15fb_merged-rep
ko18fb_merged-rep
ko11mb_merged-rep
ko13mb_merged-rep
ko15mb_merged-rep
ko18mb_merged-rep
ko11hb_merged-rep
ko13hb_merged-rep
ko15hb_merged-rep
ko18hb_merged-rep
ko11sc_merged-rep
ko13sc_merged-rep
ko15sc_merged-rep
ko18sc_merged-rep
wt11fb_merged-rep
wt13fb_merged-rep
wt15fb_merged-rep
wt18fb_merged-rep
wt11mb_merged-rep
wt13mb_merged-rep
wt15mb_merged-rep
wt18mb_merged-rep
wt11hb_merged-rep
wt13hb_merged-rep
wt15hb_merged-rep
wt18hb_merged-rep
wt11sc_merged-rep
wt13sc_merged-rep
wt15sc_merged-rep
wt18sc_merged-rep
ko11fb_merged-rep
ko13fb_merged-rep
ko15fb_merged-rep
ko18fb_merged-rep
ko11mb_merged-rep
ko13mb_merged-rep
ko15mb_merged-rep
ko18mb_merged-rep
ko11hb_merged-rep
ko13hb_merged-rep
ko15hb_merged-rep
ko18hb_merged-rep
ko11sc_merged-rep
ko13sc_merged-rep
ko15sc_merged-rep
ko18sc_merged-rep
wt11fb_merged-rep
wt13fb_merged-rep
wt15fb_merged-rep
wt18fb_merged-rep
wt11mb_merged-rep
wt13mb_merged-rep
wt15mb_merged-rep
wt18mb_merged-rep
wt11hb_merged-rep
wt13hb_merged-rep
wt15hb_merged-rep
wt18hb_merged-rep
wt11sc_merged-rep
wt13sc_merged-rep
wt15sc_merged-rep
wt18sc_merged-rep
ko11fb_merged-rep
ko13fb_merged-rep
ko15fb_merged-rep
ko18fb_merged-rep
ko11mb_merged-rep
ko13mb_merged-rep
ko15mb_merged-rep
ko18mb_merged-rep
ko11hb_merged-rep
ko13hb_merged-rep
ko15hb_merged-rep
ko18hb_merged-rep
ko11sc_merged-rep
ko13sc_merged-rep
ko15sc_merged-rep
ko18sc_merged-rep
wt11fb_merged-rep
wt13fb_merged-rep
wt15fb_merged-rep
wt18fb_merged-rep
wt11mb_merged-rep
wt13mb_merged-rep
wt15mb_merged-rep
wt18mb_merged-rep
wt11hb_merged-rep
wt13hb_merged-rep
wt15hb_merged-rep
wt18hb_merged-rep
wt11sc_merged-rep
wt13sc_merged-rep
wt15sc_merged-rep
wt18sc_merged-rep
ko11fb_merged-rep
ko13fb_merged-rep
ko15fb_merged-rep
ko18fb_merged-rep
ko11mb_merged-rep
ko13mb_merged-rep
ko15mb_merged-rep
ko18mb_merged-rep
ko11hb_merged-rep
ko13hb_merged-rep
ko15hb_merged-rep
ko18hb_merged-rep
ko11sc_merged-rep
ko13sc_merged-rep
ko15sc_merged-rep
ko18sc_merged-rep
wt11fb_merged-rep
wt13fb_merged-rep
wt15fb_merged-rep
wt18fb_merged-rep
wt11mb_merged-rep
wt13mb_merged-rep
wt15mb_merged-rep
wt18mb_merged-rep
wt11hb_merged-rep
wt13hb_merged-rep
wt15hb_merged-rep
wt18hb_merged-rep
wt11sc_merged-rep
wt13sc_merged-rep
wt15sc_merged-rep
wt18sc_merged-rep
ko11fb_merged-rep
ko13fb_merged-rep
ko15fb_merged-rep
ko18fb_merged-rep
ko11mb_merged-rep
ko13mb_merged-rep
ko15mb_merged-rep
ko18mb_merged-rep
ko11hb_merged-rep
ko13hb_merged-rep
ko15hb_merged-rep
ko18hb_merged-rep
ko11sc_merged-rep
ko13sc_merged-rep
ko15sc_merged-rep
ko18sc_merged-rep
wt11fb_merged-rep
wt13fb_merged-rep
wt15fb_merged-rep
wt18fb_merged-rep
wt11mb_merged-rep
wt13mb_merged-rep
wt15mb_merged-rep
wt18mb_merged-rep
wt11hb_merged-rep
wt13hb_merged-rep
wt15hb_merged-rep
wt18hb_merged-rep
wt11sc_merged-rep
wt13sc_merged-rep
wt15sc_merged-rep
wt18sc_merged-rep
ko11fb_merged-rep
ko13fb_merged-rep
ko15fb_merged-rep
ko18fb_merged-rep
ko11mb_merged-rep
ko13mb_merged-rep
ko15mb_merged-rep
ko18mb_merged-rep
ko11hb_merged-rep
ko13hb_merged-rep
ko15hb_merged-rep
ko18hb_merged-rep
ko11sc_merged-rep
ko13sc_merged-rep
ko15sc_merged-rep
ko18sc_merged-rep
wt11fb_merged-rep
wt13fb_merged-rep
wt15fb_merged-rep
wt18fb_merged-rep
wt11mb_merged-rep
wt13mb_merged-rep
wt15mb_merged-rep
wt18mb_merged-rep
wt11hb_merged-rep
wt13hb_merged-rep
wt15hb_merged-rep
wt18hb_merged-rep
wt11sc_merged-rep
wt13sc_merged-rep
wt15sc_merged-rep
wt18sc_merged-rep
ko11fb_merged-rep
ko13fb_merged-rep
ko15fb_merged-rep
ko18fb_merged-rep
ko11mb_merged-rep
ko13mb_merged-rep
ko15mb_merged-rep
ko18mb_merged-rep
ko11hb_merged-rep
ko13hb_merged-rep
ko15hb_merged-rep
ko18hb_merged-rep
ko11sc_merged-rep
ko13sc_merged-rep
ko15sc_merged-rep
ko18sc_merged-rep
In [36]:
"""
---------------------------------------------------------------
            Select top 20 DE genes (by padj in Forebrain)
---------------------------------------------------------------
"""
n_genes = 10
n_g_neg = -10
# Select the locfc that is the forebrain WT vs KO
e18_gene_df = merged_df.copy()

log2FoldChange = e18_gene_df['stat_p18'].values
selected_idxs = list(np.argpartition(log2FoldChange, n_genes)[:n_genes]) # np.argpartition(x, -k)[-k:]


col_colours = ['#6666ff'] * n_genes + ['#990033'] * n_genes

"""
---------------------------------------------------------------
            Plot heatmap
---------------------------------------------------------------
"""

plt_heatmap(merged_df, selected_idxs, f'Downregulated genes HB-SC e18', False, row_colours=None)
plt.show()

log2FoldChange = e18_gene_df['log2FoldChange_p18'].values
selected_idxs = list(np.argpartition(log2FoldChange, n_g_neg)[n_g_neg:])
plt_heatmap(merged_df, selected_idxs, f'Upregulated genes HB-SC e18', False, row_colours=None)
plt.show()
wt11fb_merged-rep
wt13fb_merged-rep
wt15fb_merged-rep
wt18fb_merged-rep
wt11mb_merged-rep
wt13mb_merged-rep
wt15mb_merged-rep
wt18mb_merged-rep
wt11hb_merged-rep
wt13hb_merged-rep
wt15hb_merged-rep
wt18hb_merged-rep
wt11sc_merged-rep
wt13sc_merged-rep
wt15sc_merged-rep
wt18sc_merged-rep
ko11fb_merged-rep
ko13fb_merged-rep
ko15fb_merged-rep
ko18fb_merged-rep
ko11mb_merged-rep
ko13mb_merged-rep
ko15mb_merged-rep
ko18mb_merged-rep
ko11hb_merged-rep
ko13hb_merged-rep
ko15hb_merged-rep
ko18hb_merged-rep
ko11sc_merged-rep
ko13sc_merged-rep
ko15sc_merged-rep
ko18sc_merged-rep
wt11fb_merged-rep
wt13fb_merged-rep
wt15fb_merged-rep
wt18fb_merged-rep
wt11mb_merged-rep
wt13mb_merged-rep
wt15mb_merged-rep
wt18mb_merged-rep
wt11hb_merged-rep
wt13hb_merged-rep
wt15hb_merged-rep
wt18hb_merged-rep
wt11sc_merged-rep
wt13sc_merged-rep
wt15sc_merged-rep
wt18sc_merged-rep
ko11fb_merged-rep
ko13fb_merged-rep
ko15fb_merged-rep
ko18fb_merged-rep
ko11mb_merged-rep
ko13mb_merged-rep
ko15mb_merged-rep
ko18mb_merged-rep
ko11hb_merged-rep
ko13hb_merged-rep
ko15hb_merged-rep
ko18hb_merged-rep
ko11sc_merged-rep
ko13sc_merged-rep
ko15sc_merged-rep
ko18sc_merged-rep
In [37]:
"""
---------------------------------------------------------------
            Select top 20 DE genes (by padj in Forebrain)
---------------------------------------------------------------
"""
n_genes = 10
n_g_neg = -10
# Select the locfc that is the forebrain WT vs KO
e18_gene_df = merged_df.copy()

log2FoldChange = e18_gene_df['log2FoldChange_sc'].values
selected_idxs = list(np.argpartition(log2FoldChange, n_genes)[:n_genes]) # np.argpartition(x, -k)[-k:]


col_colours = ['#6666ff'] * n_genes + ['#990033'] * n_genes

"""
---------------------------------------------------------------
            Plot heatmap
---------------------------------------------------------------
"""

plt_heatmap(merged_df, selected_idxs, f'Downregulated genes SC', False, row_colours=None)
plt.show()

log2FoldChange = e18_gene_df['log2FoldChange_sc'].values
selected_idxs = list(np.argpartition(log2FoldChange, n_g_neg)[n_g_neg:])
plt_heatmap(merged_df, selected_idxs, f'Upregulaed genes SC', False, row_colours=None)
plt.show()
wt11fb_merged-rep
wt13fb_merged-rep
wt15fb_merged-rep
wt18fb_merged-rep
wt11mb_merged-rep
wt13mb_merged-rep
wt15mb_merged-rep
wt18mb_merged-rep
wt11hb_merged-rep
wt13hb_merged-rep
wt15hb_merged-rep
wt18hb_merged-rep
wt11sc_merged-rep
wt13sc_merged-rep
wt15sc_merged-rep
wt18sc_merged-rep
ko11fb_merged-rep
ko13fb_merged-rep
ko15fb_merged-rep
ko18fb_merged-rep
ko11mb_merged-rep
ko13mb_merged-rep
ko15mb_merged-rep
ko18mb_merged-rep
ko11hb_merged-rep
ko13hb_merged-rep
ko15hb_merged-rep
ko18hb_merged-rep
ko11sc_merged-rep
ko13sc_merged-rep
ko15sc_merged-rep
ko18sc_merged-rep
wt11fb_merged-rep
wt13fb_merged-rep
wt15fb_merged-rep
wt18fb_merged-rep
wt11mb_merged-rep
wt13mb_merged-rep
wt15mb_merged-rep
wt18mb_merged-rep
wt11hb_merged-rep
wt13hb_merged-rep
wt15hb_merged-rep
wt18hb_merged-rep
wt11sc_merged-rep
wt13sc_merged-rep
wt15sc_merged-rep
wt18sc_merged-rep
ko11fb_merged-rep
ko13fb_merged-rep
ko15fb_merged-rep
ko18fb_merged-rep
ko11mb_merged-rep
ko13mb_merged-rep
ko15mb_merged-rep
ko18mb_merged-rep
ko11hb_merged-rep
ko13hb_merged-rep
ko15hb_merged-rep
ko18hb_merged-rep
ko11sc_merged-rep
ko13sc_merged-rep
ko15sc_merged-rep
ko18sc_merged-rep
In [38]:
"""
---------------------------------------------------------------
            Select top 20 DE genes (by padj in Forebrain)
---------------------------------------------------------------
"""
n_genes = 10
n_g_neg = -10
# Select the locfc that is the forebrain WT vs KO
e18_gene_df = merged_df.copy()

log2FoldChange = e18_gene_df['log2FoldChange_hb'].values
selected_idxs = list(np.argpartition(log2FoldChange, n_genes)[:n_genes]) # np.argpartition(x, -k)[-k:]


col_colours = ['#6666ff'] * n_genes + ['#990033'] * n_genes

"""
---------------------------------------------------------------
            Plot heatmap
---------------------------------------------------------------
"""

plt_heatmap(merged_df, selected_idxs, f'Downregulated genes HB', False, row_colours=None)
plt.show()

log2FoldChange = e18_gene_df['log2FoldChange_hb'].values
selected_idxs = list(np.argpartition(log2FoldChange, n_g_neg)[n_g_neg:])
plt_heatmap(merged_df, selected_idxs, f'Upregulaed genes HB', False, row_colours=None)
plt.show()
wt11fb_merged-rep
wt13fb_merged-rep
wt15fb_merged-rep
wt18fb_merged-rep
wt11mb_merged-rep
wt13mb_merged-rep
wt15mb_merged-rep
wt18mb_merged-rep
wt11hb_merged-rep
wt13hb_merged-rep
wt15hb_merged-rep
wt18hb_merged-rep
wt11sc_merged-rep
wt13sc_merged-rep
wt15sc_merged-rep
wt18sc_merged-rep
ko11fb_merged-rep
ko13fb_merged-rep
ko15fb_merged-rep
ko18fb_merged-rep
ko11mb_merged-rep
ko13mb_merged-rep
ko15mb_merged-rep
ko18mb_merged-rep
ko11hb_merged-rep
ko13hb_merged-rep
ko15hb_merged-rep
ko18hb_merged-rep
ko11sc_merged-rep
ko13sc_merged-rep
ko15sc_merged-rep
ko18sc_merged-rep
wt11fb_merged-rep
wt13fb_merged-rep
wt15fb_merged-rep
wt18fb_merged-rep
wt11mb_merged-rep
wt13mb_merged-rep
wt15mb_merged-rep
wt18mb_merged-rep
wt11hb_merged-rep
wt13hb_merged-rep
wt15hb_merged-rep
wt18hb_merged-rep
wt11sc_merged-rep
wt13sc_merged-rep
wt15sc_merged-rep
wt18sc_merged-rep
ko11fb_merged-rep
ko13fb_merged-rep
ko15fb_merged-rep
ko18fb_merged-rep
ko11mb_merged-rep
ko13mb_merged-rep
ko15mb_merged-rep
ko18mb_merged-rep
ko11hb_merged-rep
ko13hb_merged-rep
ko15hb_merged-rep
ko18hb_merged-rep
ko11sc_merged-rep
ko13sc_merged-rep
ko15sc_merged-rep
ko18sc_merged-rep
In [39]:
"""
---------------------------------------------------------------
            Select top 20 DE genes (by padj in Forebrain)
---------------------------------------------------------------
"""
n_genes = 10
n_g_neg = -10
# Select the locfc that is the forebrain WT vs KO
e18_gene_df = merged_df.copy()

log2FoldChange = e18_gene_df['log2FoldChange_mb'].values
selected_idxs = list(np.argpartition(log2FoldChange, n_genes)[:n_genes]) # np.argpartition(x, -k)[-k:]


col_colours = ['#6666ff'] * n_genes + ['#990033'] * n_genes

"""
---------------------------------------------------------------
            Plot heatmap
---------------------------------------------------------------
"""

plt_heatmap(merged_df, selected_idxs, f'Downregulated genes MB', False, row_colours=None)
plt.show()

log2FoldChange = e18_gene_df['log2FoldChange_mb'].values
selected_idxs = list(np.argpartition(log2FoldChange, n_g_neg)[n_g_neg:])
plt_heatmap(merged_df, selected_idxs, f'Upregulaed genes MB', False, row_colours=None)
plt.show()
wt11fb_merged-rep
wt13fb_merged-rep
wt15fb_merged-rep
wt18fb_merged-rep
wt11mb_merged-rep
wt13mb_merged-rep
wt15mb_merged-rep
wt18mb_merged-rep
wt11hb_merged-rep
wt13hb_merged-rep
wt15hb_merged-rep
wt18hb_merged-rep
wt11sc_merged-rep
wt13sc_merged-rep
wt15sc_merged-rep
wt18sc_merged-rep
ko11fb_merged-rep
ko13fb_merged-rep
ko15fb_merged-rep
ko18fb_merged-rep
ko11mb_merged-rep
ko13mb_merged-rep
ko15mb_merged-rep
ko18mb_merged-rep
ko11hb_merged-rep
ko13hb_merged-rep
ko15hb_merged-rep
ko18hb_merged-rep
ko11sc_merged-rep
ko13sc_merged-rep
ko15sc_merged-rep
ko18sc_merged-rep
wt11fb_merged-rep
wt13fb_merged-rep
wt15fb_merged-rep
wt18fb_merged-rep
wt11mb_merged-rep
wt13mb_merged-rep
wt15mb_merged-rep
wt18mb_merged-rep
wt11hb_merged-rep
wt13hb_merged-rep
wt15hb_merged-rep
wt18hb_merged-rep
wt11sc_merged-rep
wt13sc_merged-rep
wt15sc_merged-rep
wt18sc_merged-rep
ko11fb_merged-rep
ko13fb_merged-rep
ko15fb_merged-rep
ko18fb_merged-rep
ko11mb_merged-rep
ko13mb_merged-rep
ko15mb_merged-rep
ko18mb_merged-rep
ko11hb_merged-rep
ko13hb_merged-rep
ko15hb_merged-rep
ko18hb_merged-rep
ko11sc_merged-rep
ko13sc_merged-rep
ko15sc_merged-rep
ko18sc_merged-rep

Heatmaps of marker genes in WT and KO

In [40]:
gene_markers_sep = [['Foxg1', 'Tbr1', 'Emx1', 'Eomes'], 
                    ['Bhlhe23', 'En2', 'Fev', 'Phox2b'], 
                    ['Hoxb9', 'Hoxc9', 'Hoxd9', 'Hoxc12'],
                    ['Hoxd8', 'Hoxd9', 'Hoxd10', 'Hoxd11', 'Hoxd12', 'Hoxd13', 'Hoxa7', 'Hoxa9', 'Hoxa10', 'Hoxa11', 
                    'Hoxa13', 'Hoxb9', 'Hoxb13',  'Hoxc8', 'Hoxc9', 'Hoxc10', 'Hoxc11', 'Hoxc12', 'Hoxc13'],
                    ['Ccna1', 'Ccna2', 'Ccnd1', 'Ccnd2', 'Ccnd3', 'Ccne1', 'Ccne2', 'Cdc25a', 
                    'Cdc25b', 'Cdc25c', 'E2f1', 'E2f2', 'E2f3', 'Mcm10', 'Mcm5', 'Mcm3', 'Mcm2', 'Cip2a'],
                    ['Cdkn1a', 'Cdkn1b', 'Cdkn1c', 'Cdkn2a', 'Cdkn2b', 'Cdkn2c', 'Cdkn2d'],
                    ['Sox1', 'Sox2', 'Sox3'],
                    ['Snap25', 'Syt1', 'Slc32a1','Slc17a6', 'Syn1'],
                    ['Cspg4', 'Aqp4', 'Slc6a11', 'Olig1', 'Igfbp3']
                    ]

marker_labels_sep = ['Forebrain', 'Midbrain', 'Hindbrain',  'Spinalcord', 
                     'Proliferation', 'Neg. reg. Cell Cycle', 
                    'Progenitors', 'Neurons', 'Glia'
                    ]
In [41]:
for i, genes in enumerate(gene_markers_sep):
    selected_idxs = []
    for j, gene in enumerate(df_all[gene_name].values):
        if gene in genes:
            selected_idxs.append(j)
    plt_heatmap(merged_df, selected_idxs, marker_labels_sep[i] + " genes WT", False, row_colours=None, cond='wt')
    plt_heatmap(merged_df, selected_idxs, marker_labels_sep[i] + " genes KO", False, row_colours=None, cond='ko')

    plt.show()
wt11fb_merged-rep
wt13fb_merged-rep
wt15fb_merged-rep
wt18fb_merged-rep
wt11mb_merged-rep
wt13mb_merged-rep
wt15mb_merged-rep
wt18mb_merged-rep
wt11hb_merged-rep
wt13hb_merged-rep
wt15hb_merged-rep
wt18hb_merged-rep
wt11sc_merged-rep
wt13sc_merged-rep
wt15sc_merged-rep
wt18sc_merged-rep
ko11fb_merged-rep
ko13fb_merged-rep
ko15fb_merged-rep
ko18fb_merged-rep
ko11mb_merged-rep
ko13mb_merged-rep
ko15mb_merged-rep
ko18mb_merged-rep
ko11hb_merged-rep
ko13hb_merged-rep
ko15hb_merged-rep
ko18hb_merged-rep
ko11sc_merged-rep
ko13sc_merged-rep
ko15sc_merged-rep
ko18sc_merged-rep
wt11fb_merged-rep
wt13fb_merged-rep
wt15fb_merged-rep
wt18fb_merged-rep
wt11mb_merged-rep
wt13mb_merged-rep
wt15mb_merged-rep
wt18mb_merged-rep
wt11hb_merged-rep
wt13hb_merged-rep
wt15hb_merged-rep
wt18hb_merged-rep
wt11sc_merged-rep
wt13sc_merged-rep
wt15sc_merged-rep
wt18sc_merged-rep
ko11fb_merged-rep
ko13fb_merged-rep
ko15fb_merged-rep
ko18fb_merged-rep
ko11mb_merged-rep
ko13mb_merged-rep
ko15mb_merged-rep
ko18mb_merged-rep
ko11hb_merged-rep
ko13hb_merged-rep
ko15hb_merged-rep
ko18hb_merged-rep
ko11sc_merged-rep
ko13sc_merged-rep
ko15sc_merged-rep
ko18sc_merged-rep
wt11fb_merged-rep
wt13fb_merged-rep
wt15fb_merged-rep
wt18fb_merged-rep
wt11mb_merged-rep
wt13mb_merged-rep
wt15mb_merged-rep
wt18mb_merged-rep
wt11hb_merged-rep
wt13hb_merged-rep
wt15hb_merged-rep
wt18hb_merged-rep
wt11sc_merged-rep
wt13sc_merged-rep
wt15sc_merged-rep
wt18sc_merged-rep
ko11fb_merged-rep
ko13fb_merged-rep
ko15fb_merged-rep
ko18fb_merged-rep
ko11mb_merged-rep
ko13mb_merged-rep
ko15mb_merged-rep
ko18mb_merged-rep
ko11hb_merged-rep
ko13hb_merged-rep
ko15hb_merged-rep
ko18hb_merged-rep
ko11sc_merged-rep
ko13sc_merged-rep
ko15sc_merged-rep
ko18sc_merged-rep
wt11fb_merged-rep
wt13fb_merged-rep
wt15fb_merged-rep
wt18fb_merged-rep
wt11mb_merged-rep
wt13mb_merged-rep
wt15mb_merged-rep
wt18mb_merged-rep
wt11hb_merged-rep
wt13hb_merged-rep
wt15hb_merged-rep
wt18hb_merged-rep
wt11sc_merged-rep
wt13sc_merged-rep
wt15sc_merged-rep
wt18sc_merged-rep
ko11fb_merged-rep
ko13fb_merged-rep
ko15fb_merged-rep
ko18fb_merged-rep
ko11mb_merged-rep
ko13mb_merged-rep
ko15mb_merged-rep
ko18mb_merged-rep
ko11hb_merged-rep
ko13hb_merged-rep
ko15hb_merged-rep
ko18hb_merged-rep
ko11sc_merged-rep
ko13sc_merged-rep
ko15sc_merged-rep
ko18sc_merged-rep
wt11fb_merged-rep
wt13fb_merged-rep
wt15fb_merged-rep
wt18fb_merged-rep
wt11mb_merged-rep
wt13mb_merged-rep
wt15mb_merged-rep
wt18mb_merged-rep
wt11hb_merged-rep
wt13hb_merged-rep
wt15hb_merged-rep
wt18hb_merged-rep
wt11sc_merged-rep
wt13sc_merged-rep
wt15sc_merged-rep
wt18sc_merged-rep
ko11fb_merged-rep
ko13fb_merged-rep
ko15fb_merged-rep
ko18fb_merged-rep
ko11mb_merged-rep
ko13mb_merged-rep
ko15mb_merged-rep
ko18mb_merged-rep
ko11hb_merged-rep
ko13hb_merged-rep
ko15hb_merged-rep
ko18hb_merged-rep
ko11sc_merged-rep
ko13sc_merged-rep
ko15sc_merged-rep
ko18sc_merged-rep
wt11fb_merged-rep
wt13fb_merged-rep
wt15fb_merged-rep
wt18fb_merged-rep
wt11mb_merged-rep
wt13mb_merged-rep
wt15mb_merged-rep
wt18mb_merged-rep
wt11hb_merged-rep
wt13hb_merged-rep
wt15hb_merged-rep
wt18hb_merged-rep
wt11sc_merged-rep
wt13sc_merged-rep
wt15sc_merged-rep
wt18sc_merged-rep
ko11fb_merged-rep
ko13fb_merged-rep
ko15fb_merged-rep
ko18fb_merged-rep
ko11mb_merged-rep
ko13mb_merged-rep
ko15mb_merged-rep
ko18mb_merged-rep
ko11hb_merged-rep
ko13hb_merged-rep
ko15hb_merged-rep
ko18hb_merged-rep
ko11sc_merged-rep
ko13sc_merged-rep
ko15sc_merged-rep
ko18sc_merged-rep
wt11fb_merged-rep
wt13fb_merged-rep
wt15fb_merged-rep
wt18fb_merged-rep
wt11mb_merged-rep
wt13mb_merged-rep
wt15mb_merged-rep
wt18mb_merged-rep
wt11hb_merged-rep
wt13hb_merged-rep
wt15hb_merged-rep
wt18hb_merged-rep
wt11sc_merged-rep
wt13sc_merged-rep
wt15sc_merged-rep
wt18sc_merged-rep
ko11fb_merged-rep
ko13fb_merged-rep
ko15fb_merged-rep
ko18fb_merged-rep
ko11mb_merged-rep
ko13mb_merged-rep
ko15mb_merged-rep
ko18mb_merged-rep
ko11hb_merged-rep
ko13hb_merged-rep
ko15hb_merged-rep
ko18hb_merged-rep
ko11sc_merged-rep
ko13sc_merged-rep
ko15sc_merged-rep
ko18sc_merged-rep
wt11fb_merged-rep
wt13fb_merged-rep
wt15fb_merged-rep
wt18fb_merged-rep
wt11mb_merged-rep
wt13mb_merged-rep
wt15mb_merged-rep
wt18mb_merged-rep
wt11hb_merged-rep
wt13hb_merged-rep
wt15hb_merged-rep
wt18hb_merged-rep
wt11sc_merged-rep
wt13sc_merged-rep
wt15sc_merged-rep
wt18sc_merged-rep
ko11fb_merged-rep
ko13fb_merged-rep
ko15fb_merged-rep
ko18fb_merged-rep
ko11mb_merged-rep
ko13mb_merged-rep
ko15mb_merged-rep
ko18mb_merged-rep
ko11hb_merged-rep
ko13hb_merged-rep
ko15hb_merged-rep
ko18hb_merged-rep
ko11sc_merged-rep
ko13sc_merged-rep
ko15sc_merged-rep
ko18sc_merged-rep
wt11fb_merged-rep
wt13fb_merged-rep
wt15fb_merged-rep
wt18fb_merged-rep
wt11mb_merged-rep
wt13mb_merged-rep
wt15mb_merged-rep
wt18mb_merged-rep
wt11hb_merged-rep
wt13hb_merged-rep
wt15hb_merged-rep
wt18hb_merged-rep
wt11sc_merged-rep
wt13sc_merged-rep
wt15sc_merged-rep
wt18sc_merged-rep
ko11fb_merged-rep
ko13fb_merged-rep
ko15fb_merged-rep
ko18fb_merged-rep
ko11mb_merged-rep
ko13mb_merged-rep
ko15mb_merged-rep
ko18mb_merged-rep
ko11hb_merged-rep
ko13hb_merged-rep
ko15hb_merged-rep
ko18hb_merged-rep
ko11sc_merged-rep
ko13sc_merged-rep
ko15sc_merged-rep
ko18sc_merged-rep

10) Add Epigenetic data from Encode

We downloaded and processed the data according to the notebook DownloadEncode. Now we want to merge the peak files with our RNAseq data.

Here we keep the signal with the highest value as multiple may have been added.

Keeping in mind, we have potentially the same peak assigned to many genes. This is particularly evident in the H3K27me3 broad peak data.

First we download an annotation from We use sciloc2gene v1.0.0 to annotate this information.

Note you may need to check the reference used for the peak files (this was not needed since we also used Ensembl to map our gene's and find their location).

Need to download the annotation from: https://www.encodeproject.org/references/ENCSR425FOI/ ENCFF871VGROpen file informationDownload gtf genome reference mm10 M21

Convert the GTF to a bed and then sort the bed file (convert using gtf2bed v0.11.0 downloaded frmo https://github.com/fls-bioinformatics-core/GFFUtils/tree/master) gtf2bed ENCFF871VGR.gtf > ENCFF871VGR_mm10-m21_encode.bed

Then we convert to a CSV file and sort it by TSS.

def run_parallel(self):
    data_dir = 'data/encode/histone_modifications/beds_cns/encode/sorted_bed/'
    files = os.listdir(data_dir)
    files_to_run = []
    for filename in files:
        files_to_run.append(filename)
    # Run in paralell
    pool = ThreadPool(12)
    results = pool.map(run_bed, files_to_run)


def run_bed(filename):
    base_dir = 'data/encode/histone_modifications/beds_cns/encode/'
    data_dir = f'{base_dir}sorted_bed/'
    output_dir = f'{base_dir}scie2g_28102020/'
    overlap_method = 'in_promoter'
    if 'H3K9me3' in filename or 'H3K36me3' in filename:
        overlap_method = 'overlaps'
    #ensembl_gene_id, entrezgene_id, external_gene_name, chromosome_name, start_position, end_position, strand
    bed = Bed(f'{data_dir}{filename}', overlap_method=overlap_method,
              output_bed_file=f'{output_dir}selected_peaks/{filename}',
              buffer_after_tss=1500,
              buffer_before_tss=5000,
              buffer_gene_overlap=1500,
              chr_idx=0, start_idx=1,
              end_idx=2, peak_value=6, header_extra='8,9',
              header="chr,start,end,signal,qvalue,peak",
              gene_start=4, gene_end=5, gene_direction=6, gene_chr=3, gene_name=0
              )
    # Use your annotation file i.e. for humans
    bed.set_annotation_from_file('../supps/mm10Sorted_mmusculus_gene_ensembl-GRCm38.p6.csv')
    # Now we can run the assign values
    bed.assign_locations_to_genes()
    bed.save_loc_to_csv(f'{output_dir}parsed_files/{filename[:-3]}csv', keep_unassigned=True)
In [42]:
"""
--------------------------------------------------------
Add Epigenetic data to the RNAseq dataframe
--------------------------------------------------------
"""

%matplotlib inline

def add_peak_file(df, filename):
    # Make a dictionary from the key being the gene id to the value (width)
    mapping = {}
    mapping_signals = {}

    file_df = pd.read_csv(f'{chip_dir}{filename}')
    
    # Keep the max value
    signals = file_df['signal'].values
    widths = file_df['width'].values
    qvalues = file_df['qvalue'].values # q-value at position 9 i.e. 9 - 1 = 8
    
    # Only keep peaks that were significant i.e. qvalue > -1 * log10(0.05) ~ 1.3
    for i, g in enumerate(file_df[gene_id].values):
        qvalue = qvalues[i]
        if qvalue > 1.3:
            if not mapping.get(g):
                mapping[g] = widths[i]
                mapping_signals[g] = signals[i]
            else:
                mapping[g] = widths[i] if widths[i] > mapping[g] else mapping[g]
                # Keep the signal that corresponds to the widest peak
                mapping_signals[g] = signals[i] if widths[i] > mapping[g] else mapping_signals[g]
    # Now we want to iterate through the genes in the df and add them
    widths_ordered = []
    signals_ordered = []
    fname = filename[:-4]
    for g in df[gene_id].values:
        widths_ordered.append(mapping.get(g))
        signals_ordered.append(mapping_signals.get(g))
        
    df[f'{fname}_width'] = widths_ordered
    df[f'{fname}_signal'] = signals_ordered

    return df

# Now we want to add our chipseq data from Encode
chip_dir = os.path.join(input_dir, f'scie2g_b2500_g500/parsed_files/')
files = os.listdir(chip_dir)

chip_files = []
mark_locations = []
for f in files:
    if '.csv' in f and '.swp' not in f and 'embryonic_' in f:
        chip_files.append(f)

chip_files.sort()

# Add them to the dataframe
dfs = []
for filename in chip_files:
    u.dp(["Adding", filename])
    df_all = add_peak_file(df_all, filename)
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_10.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF003VMR.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_10.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF310NGB.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_10.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF565QAD.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_10.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF053GHW.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_10.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF900BMQ.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_10.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF928QGS.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_11.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF272TNO.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_11.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF705WVF.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_11.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF091CFQ.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_11.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF919NDM.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_11.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF133ICS.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_11.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF100TNO.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_11.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF192VXE.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_11.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF454HBY.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_12.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF754NCW.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_12.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF884KTA.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_12.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF739FEV.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_12.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF457JYV.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_12.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF864INF.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_12.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF306LDH.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_12.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF852GMM.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_12.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF831OMF.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_13.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF247VPI.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_13.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF761GSC.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_13.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF013JOA.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_13.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF215FWT.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_13.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF945PEV.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_13.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF391UBH.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_13.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF866KRC.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_13.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF004JEQ.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_14.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF531BZD.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_14.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF703RVX.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_14.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF012TIE.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_14.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF027YKW.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_14.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF928CNU.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_14.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF303ITL.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_14.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF029UVT.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_14.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF518UQD.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_15.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF384LAY.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_15.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF854XLG.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_15.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF654WPF.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_15.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF067ZBV.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_15.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF377UCX.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_15.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF490SEH.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_15.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF674WXN.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Adding	embryonic-facial-prominence_15.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF331TPD.csv	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_10.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF014HMN.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	forebrain_10.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF115VPJ.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	forebrain_10.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF095AYA.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_10.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF027RGW.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_10.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF187RGN.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_10.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF009ZBN.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_11.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF897EEM.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	forebrain_11.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF592HST.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	forebrain_11.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF524NCS.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_11.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF797KBV.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_11.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF405JZZ.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_11.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF958GPD.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_11.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF554QXT.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_11.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF926WJL.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_12.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF627DHT.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	forebrain_12.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF397HZX.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	forebrain_12.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF744VBB.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_12.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF191DCQ.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_12.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF077LYY.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_12.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF909OUL.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_12.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF572YAQ.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_12.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF207AWO.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_13.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF026QRB.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	forebrain_13.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF076ZNJ.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	forebrain_13.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF106EPT.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_13.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF155VNQ.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_13.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF441UQS.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_13.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF008XLS.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_13.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF607FOX.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_13.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF131ORZ.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_14.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF084IYL.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	forebrain_14.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF591TWD.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	forebrain_14.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF913TLX.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_14.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF898CIL.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_14.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF536OML.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_14.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF929PME.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_14.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF006UBN.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_14.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF711DMP.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_15.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF366XXD.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	forebrain_15.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF082ONF.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	forebrain_15.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF704ZRS.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_15.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF896VGU.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_15.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF799TEJ.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_15.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF615CBT.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_15.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF091BOK.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_15.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF416VVP.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_16.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF378OWA.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	forebrain_16.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF686XPK.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	forebrain_16.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF788EJN.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_16.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF502DRX.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_16.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF208FJV.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_16.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF965GZP.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_16.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF109FJW.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	forebrain_16.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF090SVU.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_10.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF814SUK.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	heart_10.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF671NPI.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	heart_10.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF482VKK.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_10.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF043CBA.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_10.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF076DLF.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_10.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF749BXF.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_11.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF954URD.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	heart_11.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF701UXU.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	heart_11.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF439XWM.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_11.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF737FNO.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_11.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF976DAN.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_11.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF678FCR.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_11.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF785EGA.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_11.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF386JRH.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_12.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF323EFD.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	heart_12.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF982CSM.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	heart_12.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF301XNX.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_12.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF508FVF.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_12.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF219QPV.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_12.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF206VAY.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_12.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF174FSI.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_12.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF839GVV.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_13.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF583IBI.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	heart_13.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF311HQB.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	heart_13.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF924ZOT.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_13.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF649ZZL.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_13.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF454NBK.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_13.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF740QCT.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_13.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF602RZE.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_13.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF798HTO.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_14.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF153SIZ.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_14.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF876UMM.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	heart_14.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF933KMD.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	heart_14.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF407TRS.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_14.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF342XGS.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_14.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF595BPW.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_14.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF425FZB.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_14.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF102NPI.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_14.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF757LMZ.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_14.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF190QIL.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_14.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF850NRY.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_15.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF309CWW.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	heart_15.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF531ZQR.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	heart_15.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF057YDK.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_15.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF813HGL.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_15.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF004HCZ.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_15.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF070XWK.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_15.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF435FFR.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_15.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF951TUZ.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_16.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF153DSU.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	heart_16.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF143OKF.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	heart_16.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF185DUB.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_16.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF953VHG.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_16.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF581WKJ.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_16.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF003WOW.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_16.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF179PHX.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	heart_16.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF801DMR.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_10.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF751ZHP.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	hindbrain_10.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF475JXJ.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	hindbrain_10.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF376OOU.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_10.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF945SSM.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_10.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF432UHX.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_10.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF676DCP.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_11.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF203QTV.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	hindbrain_11.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF809JLW.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	hindbrain_11.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF655QAK.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_11.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF542GAS.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_11.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF171AAD.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_11.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF445DQR.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_11.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF368TGQ.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_11.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF064HMK.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_12.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF409CQX.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	hindbrain_12.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF435HPM.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	hindbrain_12.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF305LVL.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_12.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF631ZAN.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_12.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF836NCO.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_12.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF577GBD.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_12.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF503HPO.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_12.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF724UQO.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_13.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF546IUI.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	hindbrain_13.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF324KBG.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	hindbrain_13.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF137CII.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_13.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF879SPX.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_13.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF186IVQ.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_13.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF945KFR.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_13.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF840MRK.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_13.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF292IDK.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_14.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF157AZQ.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	hindbrain_14.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF658ZBC.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	hindbrain_14.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF730ILG.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_14.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF844YDT.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_14.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF630KEB.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_14.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF729FLS.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_14.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF445SCM.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_14.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF906OVX.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_15.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF877YAM.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	hindbrain_15.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF086VKO.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	hindbrain_15.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF686YHI.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_15.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF524PYT.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_15.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF827TNA.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_15.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF175FGU.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_15.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF758AJN.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_15.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF441CYS.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_16.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF834HBK.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	hindbrain_16.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF787IGD.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	hindbrain_16.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF710QZP.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_16.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF961WJN.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_16.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF063GKX.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_16.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF726YXX.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_16.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF777TKP.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	hindbrain_16.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF134VAJ.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	intestine_14.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF354MWY.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	intestine_14.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF389JEN.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	intestine_14.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF554BTF.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	intestine_14.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF820FBS.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	intestine_14.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF145YIU.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	intestine_14.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF854JVF.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	intestine_14.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF406MRQ.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	intestine_14.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF010EOK.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	intestine_15.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF378DZY.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	intestine_15.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF216ZQH.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	intestine_15.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF730PKF.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	intestine_15.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF092NWD.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	intestine_15.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF701RMC.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	intestine_15.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF956QXI.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	intestine_15.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF537DMQ.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	intestine_15.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF998XVR.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	intestine_16.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF514SMO.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	intestine_16.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF937XLK.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	intestine_16.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF248XFL.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	intestine_16.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF992ONF.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	intestine_16.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF051MHA.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	intestine_16.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF645FMD.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	intestine_16.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF584IJO.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	intestine_16.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF650XPW.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	kidney_14.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF515NHP.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	kidney_14.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF201LJE.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	kidney_14.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF760WJY.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	kidney_14.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF367TAY.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	kidney_14.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF101SCV.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	kidney_14.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF077SVM.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	kidney_14.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF262CWR.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	kidney_14.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF041VPJ.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	kidney_15.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF083AAZ.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	kidney_15.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF764NLB.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	kidney_15.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF013FAX.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	kidney_15.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF285JOI.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	kidney_15.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF066PLX.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	kidney_15.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF207JEH.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	kidney_15.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF529SOW.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	kidney_15.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF700ZGY.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	kidney_16.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF386RYT.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	kidney_16.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF758LFN.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	kidney_16.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF417IPS.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	kidney_16.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF813WVL.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	kidney_16.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF420KHT.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	kidney_16.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF585WBI.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	kidney_16.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF620BBH.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	kidney_16.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF712UKV.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_10.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF117LLJ.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_10.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF416APG.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_10.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF297OQP.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_10.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF624RPQ.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_10.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF044UAM.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_11.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF948YFR.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_11.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF201XDM.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_11.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF168TSU.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_11.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF352OIG.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_11.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF696AXB.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_11.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF928TGM.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
        Adding	limb_11.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF151ZNU.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_11.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF368NDM.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_12.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF889PZP.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_12.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF355VUL.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_12.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF426CAN.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_12.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF527QRG.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_12.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF251COU.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_12.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF852KYV.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
        Adding	limb_12.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF482OCW.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_12.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF425DTV.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_13.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF467HTN.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_13.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF718DIQ.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_13.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF425VLJ.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_13.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF320PIN.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_13.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF933MQW.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_13.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF278YZX.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
        Adding	limb_13.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF032CBZ.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_13.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF829KAE.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_14.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF354CRO.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_14.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF320QNM.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_14.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF614NAZ.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_14.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF094WSH.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_14.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF656AZE.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_14.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF070OBB.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
        Adding	limb_14.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF141DMD.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_14.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF816HJK.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_15.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF399AMW.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_15.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF377NNY.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_15.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF851CNF.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_15.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF761MYS.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_15.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF172TDO.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_15.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF771CVP.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
        Adding	limb_15.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF973LJC.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	limb_15.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF305SMO.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_11.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF241YYJ.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	liver_11.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF938UBV.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	liver_11.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF691YBS.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_11.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF263ADI.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_11.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF344VVO.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_11.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF482JBE.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_11.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF343AOE.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_11.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF470CWQ.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_12.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF974RAJ.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	liver_12.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF900AWR.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	liver_12.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF705VGD.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_12.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF034TNU.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_12.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF815RAK.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_12.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF459HSF.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_12.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF598SZM.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_12.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF646CRN.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_13.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF196WCO.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	liver_13.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF449NEX.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	liver_13.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF617EOJ.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_13.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF868KHT.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_13.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF448ZXD.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_13.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF685MXX.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_13.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF515ZTF.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_13.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF825AMV.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_14.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF384FJW.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_14.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF764IFT.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	liver_14.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF624TNH.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	liver_14.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF962XLA.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_14.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF058PEK.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_14.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF396KCO.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_14.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF427EVM.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_14.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF510NON.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_14.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF719DDD.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_14.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF851AGI.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_14.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF248KEE.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_15.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF399KKD.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	liver_15.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF398IQT.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	liver_15.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF706TOV.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_15.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF938YRV.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_15.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF860WOO.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_15.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF588HJN.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_15.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF997COQ.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_15.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF210HCC.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_16.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF470UWO.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	liver_16.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF651SKE.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	liver_16.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF966IMK.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_16.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF317PNJ.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_16.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF039VCT.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_16.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF252RRE.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_16.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF604SXS.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	liver_16.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF968FFE.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	lung_14.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF093QGY.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	lung_14.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF061WET.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	lung_14.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF103ECO.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	lung_14.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF034OKB.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	lung_14.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF063FXI.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	lung_14.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF578SCN.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
        Adding	lung_14.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF938JRH.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	lung_14.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF419VVR.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	lung_15.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF399WYR.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	lung_15.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF842CGS.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	lung_15.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF014WSM.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	lung_15.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF979AUE.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	lung_15.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF778XZE.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	lung_15.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF531LFQ.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
        Adding	lung_15.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF839FLS.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	lung_15.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF346HOE.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	lung_16.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF700WUD.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	lung_16.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF913XPA.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	lung_16.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF782HRY.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	lung_16.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF696ERS.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	lung_16.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF593MNH.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	lung_16.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF246QSQ.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
        Adding	lung_16.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF548SPK.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Adding	lung_16.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF745NLF.csv	        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_10.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF281MKX.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_10.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF942MEP.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_10.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF759LSQ.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_10.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF191NER.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_10.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF110OOK.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_10.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF343IMA.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_11.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF566DFK.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_11.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF041KRQ.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_11.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF856JND.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_11.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF227ZTU.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_11.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF300MTY.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_11.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF651IYR.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	midbrain_11.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF716QVT.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_11.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF828SQE.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_12.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF213XYX.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_12.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF635JKX.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_12.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF275FND.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_12.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF980ELU.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_12.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF448EOM.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_12.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF344OEX.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	midbrain_12.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF029PID.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_12.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF528GDD.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_13.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF579HGD.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_13.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF276UEL.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_13.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF005PXT.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_13.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF097SDI.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_13.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF872YAF.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_13.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF095ZPA.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	midbrain_13.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF736SRQ.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_13.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF052XZG.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_14.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF291VFI.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_14.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF402CBS.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_14.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF182UTI.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_14.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF405TCV.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_14.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF600HWQ.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_14.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF985ZIG.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	midbrain_14.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF288VBM.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_14.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF403PZQ.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_15.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF447XAK.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_15.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF962ZLC.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_15.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF088MSQ.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_15.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF568RKB.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_15.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF293NIL.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_15.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF705XQD.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	midbrain_15.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF268GCA.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_15.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF926DZK.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_16.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF468ZXG.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_16.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF887CCI.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_16.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF310IEN.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_16.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF342RVM.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_16.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF985ACY.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_16.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF190XOL.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	midbrain_16.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF033SQD.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	midbrain_16.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF864AIL.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	neural-tube_11.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF956JDU.csv	    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
   Adding	neural-tube_11.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF090NYW.csv	    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
   Adding	neural-tube_11.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF567RNX.csv	    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	neural-tube_11.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF211AEC.csv	    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	neural-tube_11.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF105KTG.csv	    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	neural-tube_11.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF232UTI.csv	    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	neural-tube_11.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF460EBT.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	neural-tube_11.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF736PWD.csv	    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	neural-tube_12.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF434LSI.csv	    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
   Adding	neural-tube_12.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF848ZMQ.csv	    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
   Adding	neural-tube_12.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF122LDI.csv	    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	neural-tube_12.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF994GGO.csv	    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	neural-tube_12.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF433YIT.csv	    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	neural-tube_12.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF306PYA.csv	    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	neural-tube_12.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF946KMT.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	neural-tube_12.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF670WBI.csv	    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	neural-tube_13.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF659YSV.csv	    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
   Adding	neural-tube_13.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF053PYZ.csv	    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
   Adding	neural-tube_13.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF662QMT.csv	    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	neural-tube_13.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF327TUT.csv	    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	neural-tube_13.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF992ULU.csv	    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	neural-tube_13.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF791AVG.csv	    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	neural-tube_13.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF276XJH.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	neural-tube_13.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF928WRM.csv	    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	neural-tube_14.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF573BYB.csv	    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
   Adding	neural-tube_14.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF634XCB.csv	    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
   Adding	neural-tube_14.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF815AIK.csv	    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	neural-tube_14.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF772IER.csv	    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	neural-tube_14.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF089PVO.csv	    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	neural-tube_14.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF098IOA.csv	    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	neural-tube_14.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF708TRR.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	neural-tube_14.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF339VOX.csv	    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	neural-tube_15.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF464KTV.csv	    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
   Adding	neural-tube_15.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF432THP.csv	    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
   Adding	neural-tube_15.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF842ZMC.csv	    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	neural-tube_15.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF832NUL.csv	    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	neural-tube_15.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF112TNK.csv	    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	neural-tube_15.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF532MDF.csv	    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	neural-tube_15.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF571LJI.csv	     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Adding	neural-tube_15.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF117MAV.csv	    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	stomach_14.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF998WFE.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	stomach_14.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF066RLS.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	stomach_14.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF861TUC.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	stomach_14.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF199NMT.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	stomach_14.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF375ZVW.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	stomach_14.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF793VQQ.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	stomach_14.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF355FBI.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	stomach_14.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF740CQQ.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	stomach_15.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF160HCA.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	stomach_15.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF556VVF.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	stomach_15.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF864IHQ.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	stomach_15.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF748YQD.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	stomach_15.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF970SVO.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	stomach_15.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF878VPM.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	stomach_15.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF781VAZ.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	stomach_15.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF024CYU.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	stomach_16.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF223IHN.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	stomach_16.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF054GDC.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
     Adding	stomach_16.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF868NOS.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	stomach_16.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF814BNR.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	stomach_16.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF501CJA.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	stomach_16.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF569KWB.csv	      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	stomach_16.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF068FWP.csv	       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
      Adding	stomach_16.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF544RGQ.csv	      
--------------------------------------------------------------------------------

11) Filter the merged dataframe

Since we have merged so many dataframes there are likely to be many missing values.

We want to first drop any rows that are duplicates & 2) fill in null values with 0's (otherwise we'll get errors when running analyses).

We also want to organise the columns of the dataframe (this makes it easier for charts later on).

Lastly, we are only interested in genes that had a significant change between WT and KO (we want to desipher WHY this occured), so we drop any data with a non-sig result.

Filtering steps:

1) Keep genes if padj < 0.05 in at least one experiment, THEN
2) Keep genes if genes were expressed in WT or KO tissues with at least an average TMM of 2, THEN
3) Keep genes if genes at least one experiment recorded a log fold change of > 1.0

This dataset will be used for our VAE analysis.

In [43]:
# Keep only the significant data i.e. the data where at least one of the p values was < 0.05
genes_to_keep = []
ps = ((1.0 * (df_all['padj_sc'].values < 0.05)) + (1.0 * (df_all['padj_mb'].values < 0.05))
     +  (1.0 * (df_all['padj_hb'].values < 0.05)) + (1.0 * (df_all['padj_fb'].values < 0.05))
     + (1.0 * (df_all['padj_a11'].values < 0.05)) 
     + (1.0 * (df_all['padj_a13'].values < 0.05)) 
     + (1.0 * (df_all['padj_a15'].values < 0.05))
     + (1.0 * (df_all['padj_a18'].values < 0.05)))


df_all['p_mask'] = ps

# Now we also want to make sure that in each of these datasets, we have at least 2 TMM average in either WT or KO
# We don't want the genes that have super low expression since that is not interesting for our downstream analyses
wt_cols = [c for c in df_all.columns if 'wt' in c]
ko_cols = [c for c in df_all.columns if 'ko' in c]
tmm_cutoff = 0.5

expression_mask = ((1.0 * (np.mean(df_all[wt_cols].values, axis=1) >= tmm_cutoff)) + (1.0 * (np.mean(df_all[ko_cols].values, axis=1) >= tmm_cutoff)))
df_all['expression_mask'] = expression_mask

# Make a DF that only has genes that are significant at the 0.05 level
df_sig = df_all[df_all['p_mask'] > 0]

# Make a DF that only has genes that are minimally expressed in at least one condition
df_sig = df_sig[df_sig['expression_mask'] > 0]

u.dp(['Number of significant rows: ', len(df_sig)])

# Let's also make sure that we can keep have genes that have at least an absolute logfc of 0.5
log_str = 'log2FoldChange'
logfc_cols = [c for c in df_sig.columns if log_str in c]
mean_logfc = np.mean(abs(df_sig[logfc_cols].values), axis=1)
# df_sig['mean_logfc'] = mean_logfc
# df_sig = df_sig[df_sig['mean_logfc'] > 0.5]

cutoff = 1.0
logfc = ((1.0 * (abs(df_sig[f'{log_str}_sc'].values) > cutoff)) 
         + (1.0 * (abs(df_sig[f'{log_str}_mb'].values) > cutoff)) 
         + (1.0 * (abs(df_sig[f'{log_str}_hb'].values) > cutoff)) 
         + (1.0 * (abs(df_sig[f'{log_str}_fb'].values) > cutoff)) 
         + (1.0 * (abs(df_sig[f'{log_str}_a11'].values) > cutoff))
         + (1.0 * (abs(df_sig[f'{log_str}_a13'].values) > cutoff)) 
         + (1.0 * (abs(df_sig[f'{log_str}_a15'].values) > cutoff)) 
         + (1.0 * (abs(df_sig[f'{log_str}_a18'].values) > cutoff))
        )

df_sig['logfc_mask'] = logfc
df_sig_2 = df_sig[df_sig['logfc_mask'] > 2]  # Require two of the conditions it to be met
df_sig_1 = df_sig[df_sig['logfc_mask'] > 0]  # Require two of the conditions it to be met

u.dp(['Number of significant rows with an absolute FC > 1.0: ', len(df_sig_1)])
u.dp(['Number of significant rows with 2 values absolute FC > 2.0: ', len(df_sig_2)])

# Save all to dataframes
df_sig_1.to_csv(f'{output_dir}df-sig_1_epi-2500_{date}.csv', index=False)
df_sig_2.to_csv(f'{output_dir}df-consistent_epi-2500_{date}.csv', index=False)
df_sig.to_csv(f'{output_dir}df-significant_epi-2500_{date}.csv', index=False)
df_all.to_csv(f'{output_dir}df-all_epi-2500_{date}.csv', index=False)

# --------------------------------------------------------------------------------
#                       Number of significant rows: 	12532	                       
# --------------------------------------------------------------------------------
# --------------------------------------------------------------------------------
#           Number of significant rows with an absolute FC > 1.0: 	2944	          
# --------------------------------------------------------------------------------
# --------------------------------------------------------------------------------
#        Number of significant rows with 2 values absolute FC > 2.0: 	1928	       
# --------------------------------------------------------------------------------

gene_markers_sep = [['Emx1', 'Eomes', 'Tbr1', 'Foxg1', 'Lhx6'], 
                    ['En1', 'En2', 'Lmx1a', 'Bhlhe23', 'Sall4'], 
                    ['Hoxb1', 'Krox20', 'Fev', 'Hoxd3', 'Phox2b'],
                    ['Hoxd8', 'Hoxd9', 'Hoxd10', 'Hoxd11', 'Hoxd12', 'Hoxd13', 'Hoxa7', 'Hoxa9', 'Hoxa10', 'Hoxa11', 
                    'Hoxa13', 'Hoxb9', 'Hoxb13',  'Hoxc8', 'Hoxc9', 'Hoxc10', 'Hoxc11', 'Hoxc12', 'Hoxc13'],
                    ['Ccna1', 'Ccna2', 'Ccnd1', 'Ccnd2', 'Ccnd3', 'Ccne1', 'Ccne2', 'Cdc25a', 
                    'Cdc25b', 'Cdc25c', 'E2f1', 'E2f2', 'E2f3', 'Mcm10', 'Mcm5', 'Mcm3', 'Mcm2', 'Cip2a'],
                    ['Cdkn1a', 'Cdkn1b', 'Cdkn1c', 'Cdkn2a', 'Cdkn2b', 'Cdkn2c', 'Cdkn2d'],
                    ['Sox2', 'Sox1', 'Sox3', 'Hes1', 'Hes5'],
                    ['Snap25', 'Syt1', 'Slc32a1','Slc17a6', 'Syn1'],
                    ['Cspg4', 'Aqp4', 'Slc6a11', 'Olig1', 'Igfbp3'],
                    ['Foxg1'], 
                    ['En2'], 
                    ['Phox2b'],
                    ['Hoxc9'],
                    ['Sox3']
                    ]
for ml in gene_markers_sep:
    print("\n")
    for m in ml:
        for g in df_sig_1[gene_name].values:
            try:
                if g == m:
                    print(g)
            except:
                x = 0
--------------------------------------------------------------------------------
                      Number of significant rows: 	12797	                       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
          Number of significant rows with an absolute FC > 1.0: 	3148	          
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
       Number of significant rows with 2 values absolute FC > 2.0: 	1371	       
--------------------------------------------------------------------------------


Emx1
Eomes
Tbr1
Foxg1
Lhx6


En1
En2
Lmx1a
Bhlhe23
Sall4


Hoxb1
Fev
Hoxd3
Phox2b


Hoxd8
Hoxd9
Hoxd10
Hoxd11
Hoxd13
Hoxa7
Hoxa9
Hoxa10
Hoxa11
Hoxa13
Hoxb9
Hoxb13
Hoxc8
Hoxc9
Hoxc10
Hoxc11
Hoxc12
Hoxc13


Ccna2
Ccnd1
Ccnd2
Cdc25c
E2f1
E2f2
Mcm10
Mcm5
Mcm3
Mcm2
Cip2a


Cdkn1a
Cdkn2a
Cdkn2b
Cdkn2c


Sox1
Sox3
Hes5


Snap25
Slc17a6


Aqp4
Slc6a11


Foxg1


En2


Phox2b


Hoxc9


Sox3

12) Setup functions for simple analyses

We want to test whether some simple things are enriched or not. i.e. do we see different annotations if we look at the genes that were 1) significant a) with H3K27me3, b) without H3K27me3, 2) non-significant a) with H3K27me3, b) without H3K27me3.

For each of these groups we want to:

1) test for annotation enrichment
2) plot the overal HM profile 
3) plot the overal gene expression profile.
In [44]:
import venn
from matplotlib.colors import ListedColormap

def plot_venn(gene_sets, labels, title, save=True, show=True, colours=None):
    output = f'venn_{title.replace(" ", "-")}'
    dset = {}
    for i, l in enumerate(labels):
        dset[l] = gene_sets[i]
    if len(gene_sets) > 3:
        cmap = ListedColormap(sns.color_palette(colours))
        venn.venn(dset, cmap=cmap, fontsize=6, figsize=(2, 2))
    else:
        venn3(gene_sets, set_labels=labels)
    if save:
        save_fig(output)
    if show:
        plt.show()
In [45]:
from statsmodels.stats.multitest import multipletests
import string
from scipy import stats
from sciviso import Heatmap
import matplotlib 
def run_annot_plot(df_bg, df_fg, title=""):
    changes = []
    order = ['Pr-A', 'Pr-W', 'Pr-B', 'Pr-F', 'En-Sd', 'En-Sp', 
         'En-W', 'En-Pd', 'En-Pp', 'Tr-S', 'Tr-P', 'Tr-I', 
          'Hc-P', 'Hc-H', 'NS']
    total = len(df_bg) * 1.0
    total_sig = len(df_fg) * 1.0
    perc_all = []
    perc_sig = []
    
    for o in order:
        o_all = len(df_bg[df_bg['peak_value'] == o])
        o_sig = len(df_fg[df_fg['peak_value'] == o])
        perc_all.append(o_all)
        perc_sig.append(o_sig)
        if o_all == 0 and o_sig == 0:
            changes.append(0)
        else:
            changes.append(((o_sig) - (o_all)))

    odds_ratios = []
    pvalues = []
    for i, o in enumerate(order):
        # Do a FET on each one
        oddsratio, pvalue = stats.fisher_exact([[perc_sig[i], perc_all[i]], 
                                                [total_sig - perc_sig[i],
                                                 total - perc_all[i]]])

        print(o, oddsratio, pvalue,[[perc_sig[i], perc_all[i]], 
                                                [total_sig - perc_sig[i], 
                                                 total - perc_all[i]]])
        odds_ratios.append(oddsratio)
        pvalues.append(pvalue)

    reg, padj, a, b = multipletests(pvalues, alpha=0.1, 
                                    method='fdr_bh', returnsorted=False)

    p_sigs = []
    for p in padj: 
        if p > 0.05:
            p_sigs.append('')
        elif p <= 0.05 and p > 0.01:
            p_sigs.append('*')
        elif p <= 0.01 and p > 0.001:
            p_sigs.append('**')
        elif p <= 0.001 and p > 0.0001:
            p_sigs.append('***')
        elif p <= 0.0001:
            p_sigs.append('****')
        else:
            print(p)
            p_sigs.append('')

    fig, ax = plt.subplots(figsize=(1.5,1.0))
    c = [grey] * 14
    c[2] = "green"
    c[12] = "green"
    plt.bar(order, odds_ratios, color=c, linewidth=0.5, edgecolor='black')
    rects = ax.patches
    # Make some labels.
    labels = [f'{p_sigs[i]}' for i in range(0, len(rects))]

    for rect, label in zip(rects, labels):
        height = rect.get_height()
        ax.text(rect.get_x() + rect.get_width() / 2, height, label,
                ha='center', va='bottom')
    plt.ylim(0, 7)
    plt.yticks(np.arange(0, 7, 2.0))
    ax.set_title(f'Odds ratio FET for significant dataset {title} compared to all genes ChromHMM annotations')
    ax.set_xlabel('', fontsize=6)
    ax.set_ylabel('Odds ratio', fontsize=6)
    ax.set_xticklabels(order, rotation=90, ha="center")

        
    ax.tick_params(direction='out', length=2, width=0.5)
    ax.spines['bottom'].set_linewidth(0.5)
    ax.spines['top'].set_linewidth(0)
    ax.spines['left'].set_linewidth(0.5)
    ax.spines['right'].set_linewidth(0)
    ax.tick_params(labelsize=6)
    ax.tick_params(axis='x', which='major', pad=0)
    ax.tick_params(axis='y', which='major', pad=0)

    save_fig(f'OddsRatio-ChromHMM-{title}')
    
    plt.show()

tissues = ['forebrain', 'midbrain', 'hindbrain', 'neural-tube', 
           'embryonic-facial-prominence', 'limb',
           'heart', 'liver']
marks = ['H3K9me3', 'H3K36me3', 'H3K27me3', 'H3K27ac', 'H3K4me1', 'H3K4me2', 'H3K4me3','H3K9ac']

def get_median_naninc(df, mark, hist_metric="", time_1="", time_2="", time_3="", tissue="brain"):
    cols = []
    for c in df.columns:
        if tissue in c and mark in c and hist_metric in c and (time_1 in c or time_2 in c or time_3 in c):
            cols.append(c)
    # get nan median
    vals = np.nanmean(df[cols].values, axis=1)
    return vals

    
def plot_mark_heatmap(df, idxs, title, mark_cutoff=3.0):
    mark_df = pd.DataFrame()
    mark_values = []
    mean_col = []
    num_genes = 0
    if idxs is None:
        num_genes = len(df)
    else:
        num_genes = len(idxs) * 1.0
    for t in [['10.5', '11.5', '12.5'], ['15.5', '15.5', '16.5']]:
        tissue_titles = []
        marks_title = []
        for m in marks:
            mark_col = []
            for tissue in tissues:
                median_all_data = get_median_naninc(df, m, "signal", t[0], t[1], t[2], tissue)
                if idxs is None:
                    median_genes = median_all_data
                else:
                    median_genes = median_all_data[idxs]     
                has_mark = 1.0 * len(np.where(median_genes > mark_cutoff)[0])
                if has_mark == 0:
                    mark_col.append(0)
                else:
                    mark_col.append(has_mark/num_genes) #np.nan_to_num(np.nanmean(median_genes)))
                if string.capwords(tissue) not in tissue_titles:
                    if tissue == 'embryonic-facial-prominence':
                        if 'E.F.P' not in tissue_titles:
                            tissue_titles.append('E.F.P')
                    else:
                        tissue_titles.append(string.capwords(tissue))
            marks_title.append(string.capwords(m))
            mark_df[string.capwords(m)] = mark_col

        mark_df['Tissue'] = tissue_titles

        heatmap = Heatmap(mark_df, marks_title, 'Tissue', vmin=0, vmax=1, cmap='Greens', figsize=(2, 2),
                          title=f'{title} {"-".join(t)}', cluster_rows=False, cluster_cols=False)
        heatmap.plot()
        pplot()
        print(mark_df.head())
        save_fig(f'mark_all_tissues_signal-{title}_{"-".join(t)}')
        plt.show()

13) Setup the groups

1) Significnat (p < 0.05 in at least one experiment (anterior time or spatial wt/ko))
    1.a) has H3K27me3 mark (signal at e16.5 > 3.0)
        1.a.i) Expressed (mean (log2(TMM + 1) > 2 in WT OR KO) 
        1.a.j) Not expressed
    1.b) doesn't have H3K27me3 (signal at e16.5 < 3.0)
        1.b.i) Expressed
        1.b.j) Not expressed
2) Not significant 
    2.a) Has (as above)
        2.a.i) Exprerssed
        2.a.j) Not expressed
    2.b) Has not (as above)
        2.b.i) Expressed
        2.b.j) Not expressed
In [68]:
tmm_cutoff = 0.5

df_bg = df_all.copy()  # Make a copy so we don't override the other one

# Also have a look at how the marks plot over genes that are not significant
chrom_dir = os.path.join(input_dir, "supps", "annot")
chromhmm_annot = pd.read_csv(f'{chrom_dir}/e16.5_forebrain_15_segments.csv')
print(len(chromhmm_annot), len(df_bg))
# Make sure we only keep 1 annotation per gene
chromhmm_annot = chromhmm_annot.groupby('external_gene_name').first()
# Merge this with our dataframe
df_bg = df_bg.merge(chromhmm_annot, on='external_gene_name', how='left', suffixes=('', '_chmm'))
# Ensure the new length is the same as the old length
ps = ((1.0 * (df_all['padj_sc'].values < 0.05)) + (1.0 * (df_all['padj_mb'].values < 0.05))
     +  (1.0 * (df_all['padj_hb'].values < 0.05)) + (1.0 * (df_all['padj_fb'].values < 0.05))
     + (1.0 * (df_all['padj_a11'].values < 0.05)) 
     + (1.0 * (df_all['padj_a13'].values < 0.05)) 
     + (1.0 * (df_all['padj_a15'].values < 0.05))
     + (1.0 * (df_all['padj_a18'].values < 0.05)))


df_bg['p_mask'] = ps
df_bg = df_bg.fillna(0)
ko_cols = [c for c in df_bg if 'ko' in c]
wt_cols = [c for c in df_bg if 'wt' in c]

expression_mask = ((1.0 * (np.mean(df_all[wt_cols].values, axis=1) >= tmm_cutoff)) + (1.0 * (np.mean(df_all[ko_cols].values, axis=1) >= tmm_cutoff)))
df_bg['expression_mask'] = expression_mask
df_bg = df_bg[df_bg['expression_mask'] > tmm_cutoff]

cutoff = 1.0
log_str = 'log2FoldChange'
logfc = ((1.0 * (abs(df_bg[f'{log_str}_sc'].values) > cutoff)) 
         + (1.0 * (abs(df_bg[f'{log_str}_mb'].values) > cutoff)) 
         + (1.0 * (abs(df_bg[f'{log_str}_hb'].values) > cutoff)) 
         + (1.0 * (abs(df_bg[f'{log_str}_fb'].values) > cutoff)) 
         + (1.0 * (abs(df_bg[f'{log_str}_a11'].values) > cutoff))
         + (1.0 * (abs(df_bg[f'{log_str}_a13'].values) > cutoff)) 
         + (1.0 * (abs(df_bg[f'{log_str}_a15'].values) > cutoff)) 
         + (1.0 * (abs(df_bg[f'{log_str}_a18'].values) > cutoff))
        )

df_bg['logfc_mask'] = logfc

# Add in the H3K mark --> we'll just look in the brain regions at e16.5
h3k_cols = [c for c in df_bg.columns if 'brain' in c and '16.5' in c and 'signal' in c and 'H3K27me3' in c]
df_bg['hk3_mean'] = np.nanmean(df_bg[h3k_cols].values, axis=1)

ectopic = ((1.0 * (np.mean(df_bg[wt_cols].values, axis=1) <= tmm_cutoff)) + (1.0 * (np.mean(df_bg[ko_cols].values, axis=1) >= tmm_cutoff)))
df_bg['ectopic'] = ectopic

df_sig = df_bg[df_bg['p_mask'] > 0]
df_ns = df_bg[df_bg['p_mask'] == 0]
h3k_cutoff = 0.0

df_sig_hasH3 = df_sig[df_sig['hk3_mean'] > h3k_cutoff]
df_hasH3_exp = df_bg[df_bg['hk3_mean'] > h3k_cutoff]

df_noH3_exp = df_bg[df_bg['hk3_mean'] == 0]
df_noH3_exp = df_noH3_exp[df_noH3_exp['expression_mask'] > tmm_cutoff]

pert_cutoff = 2

df_sig_hasH3_pert = df_sig_hasH3[df_sig_hasH3['logfc_mask'] > pert_cutoff]  # Consistently perturbed
df_sig_hasH3_unpert = df_sig_hasH3[df_sig_hasH3['logfc_mask'] <= pert_cutoff]  # Inconsistently perturbed

df_sig_hasH3_ectopic = df_sig_hasH3[df_sig_hasH3['ectopic'] == pert_cutoff]

df_sig_hasH3_exp = df_sig_hasH3[df_sig_hasH3['expression_mask'] > tmm_cutoff]
df_sig_hasH3_exp_pert = df_sig_hasH3_exp[df_sig_hasH3_exp['logfc_mask'] > pert_cutoff]  # Consistently perturbed
df_sig_hasH3_exp_unpert = df_sig_hasH3_exp[df_sig_hasH3_exp['logfc_mask'] <= pert_cutoff]  # Inonsistently perturbed

df_sig_hasH3_not_exp = df_sig_hasH3[df_sig_hasH3['expression_mask'] <= tmm_cutoff]

df_sig_noH3 = df_sig[df_sig['hk3_mean'] <= h3k_cutoff]
df_sig_noH3_pert = df_sig_noH3[df_sig_noH3['logfc_mask'] > pert_cutoff]  # Consistently perturbed
df_sig_noH3_unpert = df_sig_noH3[df_sig_noH3['logfc_mask'] <= pert_cutoff]  # Inonsistently perturbed

df_sig_noH3_exp = df_sig_noH3[df_sig_noH3['expression_mask'] > tmm_cutoff]
# #df_bg = df_bg[df_bg['logfc_mask'] > 2] 
df_sig_noH3_exp_pert = df_sig_noH3_exp[df_sig_noH3_exp['logfc_mask'] > pert_cutoff]
df_sig_noH3_exp_unpert = df_sig_noH3_exp[df_sig_noH3_exp['logfc_mask'] <= pert_cutoff]


df_sig_noH3_ectopic = df_sig_noH3[df_sig_noH3['ectopic'] == 2]

df_sig_noH3_notexp = df_sig_noH3[df_sig_noH3['expression_mask'] <= tmm_cutoff]


df_ns_hasH3 = df_ns[df_ns['hk3_mean'] > h3k_cutoff]

df_ns_hasH3_exp = df_ns_hasH3[df_ns_hasH3['expression_mask'] > tmm_cutoff]

df_ns_hasH3_not_exp = df_ns_hasH3[df_ns_hasH3['expression_mask'] <= tmm_cutoff]
df_ns_noH3 = df_ns[df_ns['hk3_mean'] <= h3k_cutoff]
df_ns_noH3_exp = df_ns_noH3[df_ns_noH3['expression_mask'] > tmm_cutoff]
df_ns_noH3_notexp = df_ns_noH3[df_ns_noH3['expression_mask'] <= tmm_cutoff]

df_ns_noH3_ectopic = df_ns_noH3[df_ns_noH3['ectopic'] == 2]
df_ns_H3_ectopic = df_ns_hasH3[df_ns_hasH3['ectopic'] == 2]

dfs = [df_bg, df_sig, df_ns, df_sig_hasH3, df_ns_hasH3, df_sig_noH3, df_ns_noH3,
       df_sig_hasH3_pert, df_sig_hasH3_unpert, df_sig_noH3_pert, df_sig_noH3_unpert,
       df_sig_hasH3_exp_pert, df_sig_hasH3_exp_unpert, df_sig_noH3_exp_pert, df_sig_noH3_exp_unpert,
       df_sig_hasH3_exp, df_sig_hasH3_not_exp, df_sig_noH3_exp, df_sig_noH3_notexp, 
      df_ns_hasH3_exp, df_ns_hasH3_not_exp, df_ns_noH3_exp, df_ns_noH3_notexp,
      df_sig_hasH3_ectopic, df_sig_noH3_ectopic]
df_dict = {'All genes': df_bg, 
           'Genes with H3K27me3': df_hasH3_exp,
           'Genes without H3K27me3': df_noH3_exp,
             'Genes with at least 1 sig': df_sig, 
             'Non-sig genes': df_ns, 
             'Sig with H3K27me3': df_sig_hasH3, 
             'NS with H3K27me3': df_ns_hasH3,
             'Sig unmarked': df_sig_noH3, 
             'NS unmarked': df_ns_noH3,
             'Sig Marked Perturbed': df_sig_hasH3_pert,  
             'Sig Marked un-Perturbed': df_sig_hasH3_unpert, 
             'Sig Unmarked Perturbed': df_sig_noH3_pert,  
             'Sig Unmarked un-Perturbed': df_sig_noH3_unpert,
             'Exp. Sig Marked Perturbed': df_sig_hasH3_exp_pert, 
             'Exp. Sig Marked un-Perturbed': df_sig_hasH3_exp_unpert, 
             'Exp. Sig Unmarked Perturbed': df_sig_noH3_exp_pert,  
             'Exp. Sig Unmarked un-Perturbed': df_sig_noH3_exp_unpert,
            'Sig genes with H3K27me3 and expression': df_sig_hasH3_exp, 
             'Sig genes with H3K27me3 NO expression': df_sig_hasH3_not_exp, 
            'Sig genes unmarked and expression': df_sig_noH3_exp, 
             'Sig genes unmarked NO expression': df_sig_noH3_notexp,
            'NS genes with H3K27me3 and expression': df_ns_hasH3_exp,
             'NS genes with H3K27me3 NO expression': df_ns_hasH3_not_exp, 
            'NS genes unmarked and expression': df_ns_noH3_exp,
             'NS genes unmarked NO expression': df_ns_noH3_notexp, 
            'Sig ectopic marked': df_sig_hasH3_ectopic, 
             'Sig ectopic unmarked': df_sig_noH3_ectopic, 
            'NS ectopic marked': df_ns_H3_ectopic,
            'NS ectopic unmarked': df_ns_noH3_ectopic,
          }


emg = [g for g in df_dict['Sig ectopic marked'][gene_name].values]
print(", ".join(emg))
emg = [g for g in df_dict['Sig ectopic unmarked'][gene_name].values]
print(", ".join(emg))
53254 20900
Tcf24, Cryba2, Ihh, Fcrlb, Tcf21, Nodal, Npffr1, Fstl3, Gipc3, Hand1, Alox12b, Hnf1b, Gcgr, Tc2n, Gsc, Lbhd2, Prss16, Susd3, Dmgdh, Ltb4r2, Gja3, Gata4, Fam83f, Ttll8, Wnt10b, Gsc2, Ildr1, Prss41, Hs3st6, Gng13, Nkx2-5, Mpig6b, Lta, Abcg8, Dmrt1, Acbd7, Spag6, Lrrc26, Cutal, Rspo4, Rem1, Fgf2, Hapln2, Rhbg, Slc44a3, Foxe1, Cdkn2a, Kdf1, Lrrc38, Gabrd, Cwh43, Nmu, Cfap299, Gfi1, Tbx5, Hoxa13, Vax2, Zfp541, Ppm1n, Phldb3, Lypd3, Nccrp1, Slc6a16, AC151602.1, Ano9, Ascl2, Rab20, Htra4, Adrb3, Hand2, Comp, Ttc29, Il15, Mlkl, 4833427G06Rik, C2cd4a, Ankrd34c, Prss50, Ccr9
Gsta3, Rfx8, Aox3, Mdh1b, Mogat1, Sp100, Ugt1a7c, Ugt1a6a, Iqca, Cfap221, Lax1, Fcgr4, Cd48, Vsig8, Ccdc170, Vip, Zc3h12d, Lilrb4a, Oit3, Aire, Lif, Ifi47, Slc36a2, Nmur2, Slc35g3, Mgl2, Clec10a, Tm4sf5, Ccl2, Ccl12, Slfn8, Krt26, Asb16, Aanat, Card14, Cbr2, Cdhr3, Efcab10, Acot4, Batf, Gpr65, Ifi27l2a, Serpina3g, Omd, Fam81b, Bhmt, Il31ra, Dhrs2, Arl11, Rubcnl, Epsti1, Slc45a2, Gpr20, Lypd2, Meltf, Cd80, Cd200r1, Btla, Wdr27, Tff3, H2-DMb1, H2-Eb2, C4b, Ly6g6d, H2-T22, Ankrd66, Guca1a, Plin4, Psma8, Zfp474, Iigp1, Ifit3b, Ifit1, Spaca9, Morn5, Spo11, Tnfsf10, Sptssb, Chia1, Ubl4b, 1700013F07Rik, Bank1, Cyp4b1, Rhbdl2, Pla2g5, Ccdc27, Slc26a5, Tlr6, Rhoh, Ppef2, Fam47e, Gbp11, Selplg, Cfap73, Lrrc43, Ccl24, Muc3a, Card11, Clec5a, Reg3b, Cacna2d4, Clec4a3, Clec4n, C1rl, C1ra, Fgf23, BC035044, Lmntd1, Apoc1, Capn12, Sec1, Isg20, Wdr93, Cfap161, Olfr558, Trim34b, F10, Gdf15, Abcc12, Dnaaf1, Kcng4, Casp12, 1700012B09Rik, Bcl2a1b, Gadl1, Hhatl, Ccr2, Akap14, Xlr, Dmrtc1a, Cysltr1, Tex16, Lhfpl1, Ace2

Plot correlation between FB logFC and median H3K27me3 mark

In [51]:
%matplotlib inline
from scipy.stats import spearmanr
from sciviso import Histogram, Scatterplot
from numpy.polynomial.polynomial import polyfit
import statsmodels.api as sm

# Do this on the consistently affected dataset
df_const = pd.read_csv(f'{output_dir}df-consistent_epi-2500_{date}.csv')


#calculate Spearman Rank correlation and corre
# Also do the main markers
fb_genes = ['Emx1', 'Eomes', 'Tbr1', 'Foxg1', 'Lhx6']
mb_genes = ['En1', 'En2', 'Lmx1a', 'Bhlhe23', 'Sall4']
hb_genes = ['Phox2b', 'Krox20', 'Fev', 'Hoxb1',  'Hoxd3']
sc_genes = ['Hoxd8', 'Hoxd9', 'Hoxd10', 'Hoxd11','Hoxa7', 'Hoxa9', 'Hoxa10',
              'Hoxb9', 'Hoxb13',  'Hoxc8', 'Hoxc9', 'Hoxc10', 'Hoxc11', 'Hoxc12', 'Hoxc13']

# Fill NAs in genes that didn't have the H3K27me3 mark
df_const_na = df_const.fillna(0)
h3k_cols = [c for c in df_const_na.columns if 'brain' in c and 'H3K27me3' in c and 'signal' in c]
Y = np.nan_to_num(np.nanmedian(df_const_na[h3k_cols].values, axis=1))
df_const_na['Median_H3K'] = Y
X = df_const_na['log2FoldChange_fb'].values
results = sm.OLS(Y,sm.add_constant(X)).fit()
print(results.summary())
corr, _ = spearmanr(X, Y)
u.dp(["H3K27me3 median signal in brain regions vs FB logFC:\n", 'Spearmans correlation: %.3f' % corr])

# now run the scatter
s = Scatterplot(df_const_na,'Median_H3K', 'log2FoldChange_fb', title='H3K27me3 vs FB logFC', colour="green", 
                points_to_annotate=['Foxg1', 'Otx2', 'Pax2', 'Hoxd9'], 
                annotation_label=gene_name, add_legend=False)
s.opacity=0.1
ax = s.plot()
ax.set_xlim(-3, 20)
ax.set_ylim(-5, 12)

save_fig(f'Scatter_H3K27me3vsLogFC')
plt.show()


"""
--------------------------------------------------------
Do the same with H3K27ac as a control
--------------------------------------------------------
"""
h3k_cols = [c for c in df_const_na.columns if 'brain' in c and 'H3K27ac' in c and 'signal' in c]
Y = np.nan_to_num(np.nanmedian(df_const_na[h3k_cols].values, axis=1))
df_const_na['Median_H3K'] = Y
X = df_const_na['log2FoldChange_fb'].values
results = sm.OLS(Y,sm.add_constant(X)).fit()
print(results.summary())
corr, _ = spearmanr(X, Y)
u.dp(["H3K27ac median signal in brain regions vs FB logFC:\n", 'Spearmans correlation: %.3f' % corr])

# now run the scatter
s = Scatterplot(df_const_na,'Median_H3K', 'log2FoldChange_fb', title='H3K27ac vs FB logFC', colour="grey", 
                points_to_annotate=['Foxg1', 'Otx2', 'Pax2', 'Hoxd9'], 
                annotation_label=gene_name, add_legend=False)
s.opacity=0.1
ax = s.plot()
ax.set_xlim(-3, 20)
ax.set_ylim(-5, 12)

save_fig(f'Scatter_H3K27acvsLogFC')

    
    
/Users/ariane/opt/miniconda3/envs/ml/lib/python3.8/site-packages/IPython/core/interactiveshell.py:3062: DtypeWarning: Columns (4) have mixed types.Specify dtype option on import or set low_memory=False.
  has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                      y   R-squared:                       0.185
Model:                            OLS   Adj. R-squared:                  0.185
Method:                 Least Squares   F-statistic:                     311.2
Date:                Tue, 15 Jun 2021   Prob (F-statistic):           6.46e-63
Time:                        14:08:01   Log-Likelihood:                -3348.4
No. Observations:                1371   AIC:                             6701.
Df Residuals:                    1369   BIC:                             6711.
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          2.2022      0.090     24.599      0.000       2.027       2.378
x1             0.6971      0.040     17.640      0.000       0.620       0.775
==============================================================================
Omnibus:                       70.993   Durbin-Watson:                   1.762
Prob(Omnibus):                  0.000   Jarque-Bera (JB):               76.023
Skew:                           0.550   Prob(JB):                     3.10e-17
Kurtosis:                       2.655   Cond. No.                         2.87
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
--------------------------------------------------------------------------------
H3K27me3 median signal in brain regions vs FB logFC:
	Spearmans correlation: 0.357	
--------------------------------------------------------------------------------
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                      y   R-squared:                       0.073
Model:                            OLS   Adj. R-squared:                  0.072
Method:                 Least Squares   F-statistic:                     107.9
Date:                Tue, 15 Jun 2021   Prob (F-statistic):           2.20e-24
Time:                        14:08:01   Log-Likelihood:                -3945.9
No. Observations:                1371   AIC:                             7896.
Df Residuals:                    1369   BIC:                             7906.
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          2.4346      0.138     17.588      0.000       2.163       2.706
x1            -0.6347      0.061    -10.388      0.000      -0.755      -0.515
==============================================================================
Omnibus:                     1181.584   Durbin-Watson:                   1.815
Prob(Omnibus):                  0.000   Jarque-Bera (JB):            28921.064
Skew:                           4.040   Prob(JB):                         0.00
Kurtosis:                      24.000   Cond. No.                         2.87
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
--------------------------------------------------------------------------------
H3K27ac median signal in brain regions vs FB logFC:
	Spearmans correlation: -0.242	
--------------------------------------------------------------------------------
In [52]:
h3k_cols = [c for c in df_bg.columns if 'brain' in c and '10.5' in c and 'signal' in c and 'H3K27me3' in c]
df_bg['hk3_mean'] = np.nanmean(df_bg[h3k_cols].values, axis=1)
e10 = df_bg[df_bg['hk3_mean'] > h3k_cutoff][gene_name].values

h3k_cols = [c for c in df_bg.columns if 'brain' in c and '11.5' in c and 'signal' in c and 'H3K27me3' in c]
df_bg['hk3_mean'] = np.nanmean(df_bg[h3k_cols].values, axis=1)
e11 = df_bg[df_bg['hk3_mean'] > h3k_cutoff][gene_name].values

h3k_cols = [c for c in df_bg.columns if 'brain' in c and '12.5' in c and 'signal' in c and 'H3K27me3' in c]
df_bg['hk3_mean'] = np.nanmean(df_bg[h3k_cols].values, axis=1)
e12 = df_bg[df_bg['hk3_mean'] > h3k_cutoff][gene_name].values

h3k_cols = [c for c in df_bg.columns if 'brain' in c and '13.5' in c and 'signal' in c and 'H3K27me3' in c]
df_bg['hk3_mean'] = np.nanmean(df_bg[h3k_cols].values, axis=1)
e13 = df_bg[df_bg['hk3_mean'] > h3k_cutoff][gene_name].values

h3k_cols = [c for c in df_bg.columns if 'brain' in c and '14.5' in c and 'signal' in c and 'H3K27me3' in c]
df_bg['hk3_mean'] = np.nanmean(df_bg[h3k_cols].values, axis=1)
e14 = df_bg[df_bg['hk3_mean'] > h3k_cutoff][gene_name].values

h3k_cols = [c for c in df_bg.columns if 'brain' in c and '15.5' in c and 'signal' in c and 'H3K27me3' in c]
df_bg['hk3_mean'] = np.nanmean(df_bg[h3k_cols].values, axis=1)
e15 = df_bg[df_bg['hk3_mean'] > h3k_cutoff][gene_name].values

h3k_cols = [c for c in df_bg.columns if 'brain' in c and '16.5' in c and 'signal' in c and 'H3K27me3' in c]
df_bg['hk3_mean'] = np.nanmean(df_bg[h3k_cols].values, axis=1)
e16 = df_bg[df_bg['hk3_mean'] > h3k_cutoff][gene_name].values

### ----------- 
plt.plot([10.5, 11.5, 12.5, 13.5, 14.5, 15.5, 16.5], 
         [len(e10), len(e11), len(e12), len(e13), len(e14), len(e15), len(e16)]
        )
afct = set(df_sig[gene_name].values)
cafct = set(df_sig[df_sig['logfc_mask'] > pert_cutoff][gene_name].values)

plt.plot([10.5, 11.5, 12.5, 13.5, 14.5, 15.5, 16.5], 
         [len(set(afct) & set(e10)), len(set(afct) & set(e11)), 
          len(set(afct) & set(e12)), len(set(afct) & set(e13)), len(set(afct) & set(e14)), 
          len(set(afct) & set(e15)), len(set(afct) & set(e16))]
        )

plt.plot([10.5, 11.5, 12.5, 13.5, 14.5, 15.5, 16.5], 
         [len(set(cafct) & set(e10)), len(set(cafct) & set(e11)), 
          len(set(cafct) & set(e12)), len(set(cafct) & set(e13)), len(set(cafct) & set(e14)), 
          len(set(cafct) & set(e15)), len(set(cafct) & set(e16))]
        )

plt.show()

# ==============================
plt.plot([10.5,  12.5, 13.5, 14.5, 15.5, 16.5], 
         [len(e10), len(e12), len(e13), len(e14), len(e15), len(e16)], label="marked"
        )
plt.plot([10.5,  12.5, 13.5, 14.5, 15.5, 16.5], 
         [len(df_bg) - len(e10), len(df_bg) - len(e12), len(df_bg) - len(e13), len(df_bg) - len(e14), len(df_bg) - len(e15), len(df_bg) - len(e16)]
        , label="bg - marked")
afct = set(df_sig[gene_name].values)
cafct = set(df_sig[df_sig['logfc_mask'] > pert_cutoff][gene_name].values)
afct = set(df_sig[gene_name].values)

plt.plot([10.5, 12.5, 13.5, 14.5, 15.5, 16.5], 
         [len(set(afct) & set(e10)), 
          len(set(afct) & set(e12)), len(set(afct) & set(e13)), len(set(afct) & set(e14)), 
          len(set(afct) & set(e15)), len(set(afct) & set(e16))], label="Intersect, affect and marked"
        )
plt.plot([10.5, 12.5, 13.5, 14.5, 15.5, 16.5], 
         [len(set(afct)) - len(set(afct) & set(e10)), 
          len(set(afct)) - len(set(afct) & set(e12)), len(set(afct)) - len(set(afct) & set(e13)),
          len(set(afct)) - len(set(afct) & set(e14)), 
          len(set(afct)) - len(set(afct) & set(e15)), len(set(afct)) - len(set(afct) & set(e16))],
         label="unmarked affected"
        )
plt.plot([10.5, 12.5, 13.5, 14.5, 15.5, 16.5], 
         [len(set(cafct) & set(e10)), 
          len(set(cafct) & set(e12)), len(set(cafct) & set(e13)), len(set(cafct) & set(e14)), 
          len(set(cafct) & set(e15)), len(set(cafct) & set(e16))], 
         label="consist. affected, marked"
        )
plt.legend(loc="upper left")
plt.show()


#------------ 
h3k_cols = [c for c in df_bg.columns if 'brain' in c and 'signal' in c and 'H3K27me3' in c and '16.5' in c]
df_bg['hk3_mean'] = np.nanmean(df_bg[h3k_cols].values, axis=1)
plot_venn((set(df_bg[df_bg['hk3_mean'] == 0][gene_name].values), 
           set(df_sig[gene_name].values), set(df_sig[df_sig['logfc_mask'] > pert_cutoff][gene_name].values)),
          ('Un-Marked genes', 'Affected Genes', 'Const. Affected Genes'), 'venn_unmaked_marked_const_aff')

#------------ 
h3k_cols = [c for c in df_bg.columns if 'brain' in c and 'signal' in c and 'H3K27me3' in c and '10.5' in c]
df_bg['hk3_mean'] = np.nanmean(df_bg[h3k_cols].values, axis=1)
plot_venn((set(df_bg[df_bg['hk3_mean'] > h3k_cutoff][gene_name].values), 
           set(df_sig[gene_name].values), set(df_sig[df_sig['logfc_mask'] > pert_cutoff][gene_name].values)),
          ('Marked genes', 'Affected Genes', 'Const. Affected Genes'), 'venn_marked_const_aff')
/Users/ariane/opt/miniconda3/envs/ml/lib/python3.8/site-packages/matplotlib_venn/_venn3.py:117: UserWarning: Bad circle positioning
  warnings.warn("Bad circle positioning")
In [53]:
# Print out the size of each DF
for label, d in df_dict.items():
    u.dp([label, len(d)])
--------------------------------------------------------------------------------
                                All genes	14968	                                
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                           Genes with H3K27me3	2596	                            
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                         Genes without H3K27me3	12372	                          
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                        Genes with at least 1 sig	12797	                        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Non-sig genes	2171	                               
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                            Sig with H3K27me3	2416	                             
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                             NS with H3K27me3	180	                              
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              Sig unmarked	10381	                               
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                               NS unmarked	1991	                                
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                           Sig Marked Perturbed	720	                            
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                         Sig Marked un-Perturbed	1696	                          
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                          Sig Unmarked Perturbed	651	                           
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                        Sig Unmarked un-Perturbed	9730	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                         Exp. Sig Marked Perturbed	720	                         
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                       Exp. Sig Marked un-Perturbed	1696	                       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                        Exp. Sig Unmarked Perturbed	651	                        
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                      Exp. Sig Unmarked un-Perturbed	9730	                      
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                  Sig genes with H3K27me3 and expression	2416	                  
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                    Sig genes with H3K27me3 NO expression	0	                    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                    Sig genes unmarked and expression	10381	                    
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                      Sig genes unmarked NO expression	0	                       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                   NS genes with H3K27me3 and expression	180	                   
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                    NS genes with H3K27me3 NO expression	0	                     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                     NS genes unmarked and expression	1991	                     
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                       NS genes unmarked NO expression	0	                       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                             Sig ectopic marked	79	                             
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                           Sig ectopic unmarked	134	                            
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                              NS ectopic marked	5	                              
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
                            NS ectopic unmarked	25	                             
--------------------------------------------------------------------------------

13) Test for enrichment of specific marks using the Gorkin et al annotations

In [69]:
# Save each DF so we can test each of them for ORA (just save the gene name to avoid dups.)
for label, d in df_dict.items():
    print(label)
    dn = d[gene_id]
    dn.to_csv(os.path.join(output_dir, f'{label.replace(" ", "_")}_gene-name_{date}.csv'), index=False)
All genes
Genes with H3K27me3
Genes without H3K27me3
Genes with at least 1 sig
Non-sig genes
Sig with H3K27me3
NS with H3K27me3
Sig unmarked
NS unmarked
Sig Marked Perturbed
Sig Marked un-Perturbed
Sig Unmarked Perturbed
Sig Unmarked un-Perturbed
Exp. Sig Marked Perturbed
Exp. Sig Marked un-Perturbed
Exp. Sig Unmarked Perturbed
Exp. Sig Unmarked un-Perturbed
Sig genes with H3K27me3 and expression
Sig genes with H3K27me3 NO expression
Sig genes unmarked and expression
Sig genes unmarked NO expression
NS genes with H3K27me3 and expression
NS genes with H3K27me3 NO expression
NS genes unmarked and expression
NS genes unmarked NO expression
Sig ectopic marked
Sig ectopic unmarked
NS ectopic marked
NS ectopic unmarked
In [56]:
"""
--------------------------------------------------------
Test for enriched gorkin et al states
--------------------------------------------------------
"""
for label, d in df_dict.items():
    run_annot_plot(df_bg, d, title=f'Annot {label}')
Pr-A 1.0 1.0 [[5885, 5885], [9083.0, 9083.0]]
Pr-W 1.0 1.0 [[3464, 3464], [11504.0, 11504.0]]
Pr-B 1.0 1.0 [[806, 806], [14162.0, 14162.0]]
Pr-F 1.0 1.0 [[1042, 1042], [13926.0, 13926.0]]
En-Sd 1.0 1.0 [[59, 59], [14909.0, 14909.0]]
En-Sp 1.0 1.0 [[348, 348], [14620.0, 14620.0]]
En-W 1.0 1.0 [[23, 23], [14945.0, 14945.0]]
En-Pd 1.0 1.0 [[155, 155], [14813.0, 14813.0]]
En-Pp 1.0 1.0 [[352, 352], [14616.0, 14616.0]]
Tr-S 1.0 1.0 [[51, 51], [14917.0, 14917.0]]
Tr-P 1.0 1.0 [[997, 997], [13971.0, 13971.0]]
Tr-I 1.0 1.0 [[114, 114], [14854.0, 14854.0]]
Hc-P 1.0 1.0 [[311, 311], [14657.0, 14657.0]]
Hc-H 1.0 1.0 [[54, 54], [14914.0, 14914.0]]
NS 1.0 1.0 [[747, 747], [14221.0, 14221.0]]
<ipython-input-45-c5d5c405855c>:77: UserWarning: FixedFormatter should only be used together with FixedLocator
  ax.set_xticklabels(order, rotation=90, ha="center")
Pr-A 0.24210438635947157 1.5288251008234748e-160 [[352, 5885], [2244.0, 9083.0]]
Pr-W 0.7707531586671943 8.82178889259204e-07 [[489, 3464], [2107.0, 11504.0]]
Pr-B 7.658034871209293 3.869481077353874e-268 [[788, 806], [1808.0, 14162.0]]
Pr-F 2.2988462021724154 8.564505369731715e-35 [[381, 1042], [2215.0, 13926.0]]
En-Sd 0.487639743832955 0.15536536696834718 [[5, 59], [2591.0, 14909.0]]
En-Sp 0.8755836395604792 0.39383457522404997 [[53, 348], [2543.0, 14620.0]]
En-W 0.0 0.03909597693235345 [[0, 23], [2596.0, 14945.0]]
En-Pd 0.48098360246275274 0.008348898548592993 [[13, 155], [2583.0, 14813.0]]
En-Pp 0.5674718682332895 0.0008329972164518171 [[35, 352], [2561.0, 14616.0]]
Tr-S 0.0 0.0005057624163226845 [[0, 51], [2596.0, 14917.0]]
Tr-P 0.3485280159467744 3.1862659558307444e-20 [[63, 997], [2533.0, 13971.0]]
Tr-I 1.3694249247095942 0.1519492557862803 [[27, 114], [2569.0, 14854.0]]
Hc-P 5.949860153029553 1.66421426639638e-88 [[291, 311], [2305.0, 14657.0]]
Hc-H 0.3195355015640399 0.039381665544334435 [[3, 54], [2593.0, 14914.0]]
NS 0.08101056709105928 2.2458503187919558e-38 [[11, 747], [2585.0, 14221.0]]
Pr-A 1.2486793035545176 2.1059896133398323e-19 [[5533, 5885], [6839.0, 9083.0]]
Pr-W 1.051401840447826 0.0806854086676437 [[2975, 3464], [9397.0, 11504.0]]
Pr-B 0.025600854205406996 4.861687845129709e-185 [[18, 806], [12354.0, 14162.0]]
Pr-F 0.754338285559568 3.0963394272441627e-08 [[661, 1042], [11711.0, 13926.0]]
En-Sd 1.1077711823127792 0.6361104147042314 [[54, 59], [12318.0, 14909.0]]
En-Sp 1.0261977978469572 0.7485960286860522 [[295, 348], [12077.0, 14620.0]]
En-W 1.2102194509676898 0.5549686692086748 [[23, 23], [12349.0, 14945.0]]
En-Pd 1.1096172816711947 0.37979319883722756 [[142, 155], [12230.0, 14813.0]]
En-Pp 1.0918875608008747 0.27094418554334565 [[317, 352], [12055.0, 14616.0]]
Tr-S 1.2106971836701566 0.37000577474536944 [[51, 51], [12321.0, 14917.0]]
Tr-P 1.1442715977974138 0.004430077722762405 [[934, 997], [11438.0, 13971.0]]
Tr-I 0.9227470385365122 0.6187431886899627 [[87, 114], [12285.0, 14854.0]]
Hc-P 0.07630928977225397 1.2444808863936277e-57 [[20, 311], [12352.0, 14657.0]]
Hc-H 1.1432062693323954 0.4937809858486971 [[51, 54], [12321.0, 14914.0]]
NS 1.204158446551187 0.0005364689695315265 [[736, 747], [11636.0, 14221.0]]
Pr-A 0.9969173576070832 0.9019063953449773 [[5022, 5885], [7775.0, 9083.0]]
Pr-W 0.9944838215236491 0.8526973591024303 [[2949, 3464], [9848.0, 11504.0]]
Pr-B 1.1093916173683664 0.04734402215783287 [[760, 806], [12037.0, 14162.0]]
Pr-F 1.0692648974321635 0.15456572825426157 [[948, 1042], [11849.0, 13926.0]]
En-Sd 1.0509126262142636 0.8494742316131091 [[53, 59], [12744.0, 14909.0]]
En-Sp 1.0395220791269963 0.6347160636240619 [[309, 348], [12488.0, 14620.0]]
En-W 0.9152583892731622 0.8757941656140028 [[18, 23], [12779.0, 14945.0]]
En-Pd 1.0189263276962819 0.9057666847002287 [[135, 155], [12662.0, 14813.0]]
En-Pp 0.9899810166631512 0.9050005166446206 [[298, 352], [12499.0, 14616.0]]
Tr-S 0.7791793987829403 0.27714150921013736 [[34, 51], [12763.0, 14917.0]]
Tr-P 0.9021119751169002 0.03850739040487501 [[774, 997], [12023.0, 13971.0]]
Tr-I 0.9848540728247672 0.9446602076932347 [[96, 114], [12701.0, 14854.0]]
Hc-P 1.0619472709918516 0.4791900765574865 [[282, 311], [12515.0, 14657.0]]
Hc-H 0.8877071646748661 0.6069421257054404 [[41, 54], [12756.0, 14914.0]]
NS 0.949609469683366 0.3712834356803585 [[608, 747], [12189.0, 14221.0]]
Pr-A 1.0183238108600365 0.7069008481448309 [[863, 5885], [1308.0, 9083.0]]
Pr-W 1.0328039406009082 0.5496795248633665 [[515, 3464], [1656.0, 11504.0]]
Pr-B 0.3803544008173989 8.632534223299366e-13 [[46, 806], [2125.0, 14162.0]]
Pr-F 0.604853264480643 1.4829776816794924e-06 [[94, 1042], [2077.0, 13926.0]]
En-Sd 0.7003092339609347 0.5732601558566706 [[6, 59], [2165.0, 14909.0]]
En-Sp 0.7685029436501262 0.14131486499781698 [[39, 348], [2132.0, 14620.0]]
En-W 1.4999598538680798 0.3908471961098976 [[5, 23], [2166.0, 14945.0]]
En-Pd 0.8885889533750244 0.7315854759344855 [[20, 155], [2151.0, 14813.0]]
En-Pp 1.0591531755915318 0.705578680138238 [[54, 352], [2117.0, 14616.0]]
Tr-S 2.3084184463014545 0.005125337648515433 [[17, 51], [2154.0, 14917.0]]
Tr-P 1.6041620755490291 5.564152351846437e-09 [[223, 997], [1948.0, 13971.0]]
Tr-I 1.0893490111716821 0.6941403652520004 [[18, 114], [2153.0, 14854.0]]
Hc-P 0.6380625133225852 0.020732031269808812 [[29, 311], [2142.0, 14657.0]]
Hc-H 1.6637661758143685 0.09860321512893305 [[13, 54], [2158.0, 14914.0]]
NS 1.3022687864318165 0.0069237899343419325 [[139, 747], [2032.0, 14221.0]]
Pr-A 0.23818595057928663 2.236952445294526e-153 [[323, 5885], [2093.0, 9083.0]]
Pr-W 0.770557040111269 2.0066880841137127e-06 [[455, 3464], [1961.0, 11504.0]]
Pr-B 7.788216215895717 2.0042205275118214e-259 [[742, 806], [1674.0, 14162.0]]
Pr-F 2.363068698678868 2.673743261926909e-35 [[363, 1042], [2053.0, 13926.0]]
En-Sd 0.41906287770188605 0.09873307057889252 [[4, 59], [2412.0, 14909.0]]
En-Sp 0.9241107026858821 0.6608429851608619 [[52, 348], [2364.0, 14620.0]]
En-W 0.0 0.06396153528908419 [[0, 23], [2416.0, 14945.0]]
En-Pd 0.39720590995629207 0.002089334647878005 [[10, 155], [2406.0, 14813.0]]
En-Pp 0.5750104909777591 0.0016823232139883632 [[33, 352], [2383.0, 14616.0]]
Tr-S 0.0 0.0008198459150934424 [[0, 51], [2416.0, 14917.0]]
Tr-P 0.3143000733049763 1.8402472560241576e-21 [[53, 997], [2363.0, 13971.0]]
Tr-I 1.3073402569970076 0.26468037308163117 [[24, 114], [2392.0, 14854.0]]
Hc-P 5.830796380767217 3.605088624399581e-81 [[266, 311], [2150.0, 14657.0]]
Hc-H 0.3433715522401805 0.05614670695258997 [[3, 54], [2413.0, 14914.0]]
NS 0.07118294532513102 2.5705566344327756e-37 [[9, 747], [2407.0, 14221.0]]
Pr-A 0.29641753925965103 2.551605400964937e-11 [[29, 5885], [151.0, 9083.0]]
Pr-W 0.7733873263943813 0.21244279261099747 [[34, 3464], [146.0, 11504.0]]
Pr-B 6.031739565201288 1.3467043737187781e-18 [[46, 806], [134.0, 14162.0]]
Pr-F 1.4849648112603966 0.13860694123361939 [[18, 1042], [162.0, 13926.0]]
En-Sd 1.41170343717451 0.512587328383429 [[1, 59], [179.0, 14909.0]]
En-Sp 0.23470108521158414 0.13485578173566978 [[1, 348], [179.0, 14620.0]]
En-W 0.0 1.0 [[0, 23], [180.0, 14945.0]]
En-Pd 1.619792236194642 0.43957989016331256 [[3, 155], [177.0, 14813.0]]
En-Pp 0.46654749744637386 0.4497847359448516 [[2, 352], [178.0, 14616.0]]
Tr-S 0.0 1.0 [[0, 51], [180.0, 14917.0]]
Tr-P 0.824296418667768 0.6527148928406563 [[10, 997], [170.0, 13971.0]]
Tr-I 2.208444840915849 0.16269180905899303 [[3, 114], [177.0, 14854.0]]
Hc-P 7.601389897313557 2.2132966867186292e-13 [[25, 311], [155.0, 14657.0]]
Hc-H 0.0 1.0 [[0, 54], [180.0, 14914.0]]
NS 0.2139043063640329 0.013627939739904006 [[2, 747], [178.0, 14221.0]]
Pr-A 1.2764007850814194 4.0755655433711264e-21 [[4699, 5885], [5682.0, 9083.0]]
Pr-W 1.0501603041342333 0.10384414174317344 [[2494, 3464], [7887.0, 11504.0]]
Pr-B 0.030519439626903216 3.948499231679287e-159 [[18, 806], [10363.0, 14162.0]]
Pr-F 0.7981155299393619 2.1350211873678833e-05 [[585, 1042], [9796.0, 13926.0]]
En-Sd 1.198417619769418 0.377805338698898 [[49, 59], [10332.0, 14909.0]]
En-Sp 1.0664711599158936 0.45144754610565646 [[257, 348], [10124.0, 14620.0]]
En-W 1.1286390964510025 0.7514789076036937 [[18, 23], [10363.0, 14945.0]]
En-Pd 1.1647784459765487 0.2216479384996152 [[125, 155], [10256.0, 14813.0]]
En-Pp 1.0877345519249433 0.3199448127236231 [[265, 352], [10116.0, 14616.0]]
Tr-S 0.9611159434296576 0.9122768135132235 [[34, 51], [10347.0, 14917.0]]
Tr-P 1.045900745715407 0.3876751339707447 [[721, 997], [9660.0, 13971.0]]
Tr-I 0.91002751811141 0.5502457629914534 [[72, 114], [10309.0, 14854.0]]
Hc-P 0.07275039824539362 1.2754377804783065e-51 [[16, 311], [10365.0, 14657.0]]
Hc-H 1.0146995104937675 1.0 [[38, 54], [10343.0, 14914.0]]
NS 1.1657587892632344 0.0068058200909345435 [[599, 747], [9782.0, 14221.0]]
Pr-A 1.1125397546903375 0.02818185763382116 [[834, 5885], [1157.0, 9083.0]]
Pr-W 1.057886606610281 0.32308910897636933 [[481, 3464], [1510.0, 11504.0]]
Pr-B 0.0 2.476691958845903e-45 [[0, 806], [1991.0, 14162.0]]
Pr-F 0.5303999639175516 2.097775509860076e-08 [[76, 1042], [1915.0, 13926.0]]
En-Sd 0.6361906224930445 0.4361429485871955 [[5, 59], [1986.0, 14909.0]]
En-Sp 0.8174279475725527 0.2631359298283783 [[38, 348], [1953.0, 14620.0]]
En-W 1.6359078768772712 0.36925019693946404 [[5, 23], [1986.0, 14945.0]]
En-Pd 0.8230251331829918 0.5509370900222303 [[17, 155], [1974.0, 14813.0]]
En-Pp 1.1135543157203807 0.4810843988915803 [[52, 352], [1939.0, 14616.0]]
Tr-S 2.5189125295508275 0.002009889597174559 [[17, 51], [1974.0, 14917.0]]
Tr-P 1.678727408321703 5.721312892154466e-10 [[213, 997], [1778.0, 13971.0]]
Tr-I 0.989106115491157 1.0 [[15, 114], [1976.0, 14854.0]]
Hc-P 0.09487391517532773 1.3715972330113823e-12 [[4, 311], [1987.0, 14657.0]]
Hc-H 1.8151705800846347 0.057215537490759756 [[13, 54], [1978.0, 14914.0]]
NS 1.406761169092046 0.0005787500204376309 [[137, 747], [1854.0, 14221.0]]
Pr-A 0.0509304959109864 6.0770285843974434e-114 [[23, 5885], [697.0, 9083.0]]
Pr-W 0.2746705099932279 3.524615789357567e-27 [[55, 3464], [665.0, 11504.0]]
Pr-B 15.46039818852082 5.623857672765395e-201 [[337, 806], [383.0, 14162.0]]
Pr-F 1.9335233393040776 1.2484936862153932e-07 [[91, 1042], [629.0, 13926.0]]
En-Sd 0.35145328964427996 0.5276247876279874 [[1, 59], [719.0, 14909.0]]
En-Sp 0.6518003339655983 0.20032185973273325 [[11, 348], [709.0, 14620.0]]
En-W 0.0 0.6242399728422908 [[0, 23], [720.0, 14945.0]]
En-Pd 0.5338979996395747 0.25484121725338116 [[4, 155], [716.0, 14813.0]]
En-Pp 0.5848271446862996 0.09837506379891905 [[10, 352], [710.0, 14616.0]]
Tr-S 0.0 0.17423368865478425 [[0, 51], [720.0, 14917.0]]
Tr-P 0.23750913758223824 7.681181819389645e-10 [[12, 997], [708.0, 13971.0]]
Tr-I 1.2792254127605127 0.508398041890984 [[7, 114], [713.0, 14854.0]]
Hc-P 11.17482679749395 1.7147442153265933e-77 [[138, 311], [582.0, 14657.0]]
Hc-H 0.0 0.17958360073965343 [[0, 54], [720.0, 14914.0]]
NS 0.07965474170041392 3.1054193328126303e-12 [[3, 747], [717.0, 14221.0]]
Pr-A 0.33167954076825884 3.0183776897993817e-75 [[300, 5885], [1396.0, 9083.0]]
Pr-W 1.0250049895931344 0.6931305769882641 [[400, 3464], [1296.0, 11504.0]]
Pr-B 5.512115754613443 1.994864626945351e-119 [[405, 806], [1291.0, 14162.0]]
Pr-F 2.5528046755375358 1.8746207153446434e-32 [[272, 1042], [1424.0, 13926.0]]
En-Sd 0.4477759868651576 0.20696709017086629 [[3, 59], [1693.0, 14909.0]]
En-Sp 1.0407681355696774 0.798994250648637 [[41, 348], [1655.0, 14620.0]]
En-W 0.0 0.1620866229946606 [[0, 23], [1696.0, 14945.0]]
En-Pd 0.3392937583508303 0.0036959390074737582 [[6, 155], [1690.0, 14813.0]]
En-Pp 0.5708444275389882 0.007211874141119964 [[23, 352], [1673.0, 14616.0]]
Tr-S 0.0 0.0085183957127423 [[0, 51], [1696.0, 14917.0]]
Tr-P 0.3471508180129512 2.8063545112976028e-14 [[41, 997], [1655.0, 13971.0]]
Tr-I 1.319279437426204 0.3075459395350693 [[17, 114], [1679.0, 14854.0]]
Hc-P 3.8472340704770653 1.9346216971430265e-29 [[128, 311], [1568.0, 14657.0]]
Hc-H 0.48940080068254904 0.2758557015255797 [[3, 54], [1693.0, 14914.0]]
NS 0.06758869798721513 1.920563918861942e-27 [[6, 747], [1690.0, 14221.0]]
Pr-A 0.1624647855833296 1.6278070219087217e-63 [[62, 5885], [589.0, 9083.0]]
Pr-W 0.5465715336277666 2.023405810891417e-08 [[92, 3464], [559.0, 11504.0]]
Pr-B 0.1909860826410616 1.3767449328306686e-08 [[7, 806], [644.0, 14162.0]]
Pr-F 1.6879271297544647 0.0001249201569689377 [[73, 1042], [578.0, 13926.0]]
En-Sd 2.3506503744580214 0.053152944183913135 [[6, 59], [645.0, 14909.0]]
En-Sp 0.9908371286055085 1.0 [[15, 348], [636.0, 14620.0]]
En-W 4.017203144949936 0.024484895054830192 [[4, 23], [647.0, 14945.0]]
En-Pd 3.6580953850902915 4.2436721248647006e-07 [[24, 155], [627.0, 14813.0]]
En-Pp 1.7273454545454545 0.012501714757568927 [[26, 352], [625.0, 14616.0]]
Tr-S 1.8082856017213687 0.2914953837078631 [[4, 51], [647.0, 14917.0]]
Tr-P 3.2317405903912113 1.4726935990652035e-23 [[122, 997], [529.0, 13971.0]]
Tr-I 1.6211290278573571 0.171644244755815 [[8, 114], [643.0, 14854.0]]
Hc-P 0.2913670316126371 0.006096564170592466 [[4, 311], [647.0, 14657.0]]
Hc-H 0.8511099697540375 1.0 [[2, 54], [649.0, 14914.0]]
NS 6.836006517370805 5.6715846419086585e-68 [[172, 747], [479.0, 14221.0]]
Pr-A 1.4052262914046818 3.193497314751803e-38 [[4637, 5885], [5093.0, 9083.0]]
Pr-W 1.0885754409673547 0.005613221839185514 [[2402, 3464], [7328.0, 11504.0]]
Pr-B 0.019886605168510582 1.8738270089337863e-159 [[11, 806], [9719.0, 14162.0]]
Pr-F 0.7423213115955639 5.874878403594816e-08 [[512, 1042], [9218.0, 13926.0]]
En-Sd 1.1216972598257668 0.6119182997533829 [[43, 59], [9687.0, 14909.0]]
En-Sp 1.071541063363765 0.41807205283336224 [[242, 348], [9488.0, 14620.0]]
En-W 0.9362861796767322 1.0 [[14, 23], [9716.0, 14945.0]]
En-Pd 1.0024241287240494 1.0 [[101, 155], [9629.0, 14813.0]]
En-Pp 1.0456149845307996 0.6092700953581481 [[239, 352], [9491.0, 14616.0]]
Tr-S 0.9046088538508187 0.7330413293657585 [[30, 51], [9700.0, 14917.0]]
Tr-P 0.9192651879634084 0.11816325487723417 [[599, 997], [9131.0, 13971.0]]
Tr-I 0.8627237450132677 0.3567788730979695 [[64, 114], [9666.0, 14854.0]]
Hc-P 0.0581954525993135 2.8647253430364884e-52 [[12, 311], [9718.0, 14657.0]]
Hc-H 1.0256516058042775 0.914256715499572 [[36, 54], [9694.0, 14914.0]]
NS 0.8738047247933294 0.029821004531799478 [[427, 747], [9303.0, 14221.0]]
Pr-A 0.0509304959109864 6.0770285843974434e-114 [[23, 5885], [697.0, 9083.0]]
Pr-W 0.2746705099932279 3.524615789357567e-27 [[55, 3464], [665.0, 11504.0]]
Pr-B 15.46039818852082 5.623857672765395e-201 [[337, 806], [383.0, 14162.0]]
Pr-F 1.9335233393040776 1.2484936862153932e-07 [[91, 1042], [629.0, 13926.0]]
En-Sd 0.35145328964427996 0.5276247876279874 [[1, 59], [719.0, 14909.0]]
En-Sp 0.6518003339655983 0.20032185973273325 [[11, 348], [709.0, 14620.0]]
En-W 0.0 0.6242399728422908 [[0, 23], [720.0, 14945.0]]
En-Pd 0.5338979996395747 0.25484121725338116 [[4, 155], [716.0, 14813.0]]
En-Pp 0.5848271446862996 0.09837506379891905 [[10, 352], [710.0, 14616.0]]
Tr-S 0.0 0.17423368865478425 [[0, 51], [720.0, 14917.0]]
Tr-P 0.23750913758223824 7.681181819389645e-10 [[12, 997], [708.0, 13971.0]]
Tr-I 1.2792254127605127 0.508398041890984 [[7, 114], [713.0, 14854.0]]
Hc-P 11.17482679749395 1.7147442153265933e-77 [[138, 311], [582.0, 14657.0]]
Hc-H 0.0 0.17958360073965343 [[0, 54], [720.0, 14914.0]]
NS 0.07965474170041392 3.1054193328126303e-12 [[3, 747], [717.0, 14221.0]]
Pr-A 0.33167954076825884 3.0183776897993817e-75 [[300, 5885], [1396.0, 9083.0]]
Pr-W 1.0250049895931344 0.6931305769882641 [[400, 3464], [1296.0, 11504.0]]
Pr-B 5.512115754613443 1.994864626945351e-119 [[405, 806], [1291.0, 14162.0]]
Pr-F 2.5528046755375358 1.8746207153446434e-32 [[272, 1042], [1424.0, 13926.0]]
En-Sd 0.4477759868651576 0.20696709017086629 [[3, 59], [1693.0, 14909.0]]
En-Sp 1.0407681355696774 0.798994250648637 [[41, 348], [1655.0, 14620.0]]
En-W 0.0 0.1620866229946606 [[0, 23], [1696.0, 14945.0]]
En-Pd 0.3392937583508303 0.0036959390074737582 [[6, 155], [1690.0, 14813.0]]
En-Pp 0.5708444275389882 0.007211874141119964 [[23, 352], [1673.0, 14616.0]]
Tr-S 0.0 0.0085183957127423 [[0, 51], [1696.0, 14917.0]]
Tr-P 0.3471508180129512 2.8063545112976028e-14 [[41, 997], [1655.0, 13971.0]]
Tr-I 1.319279437426204 0.3075459395350693 [[17, 114], [1679.0, 14854.0]]
Hc-P 3.8472340704770653 1.9346216971430265e-29 [[128, 311], [1568.0, 14657.0]]
Hc-H 0.48940080068254904 0.2758557015255797 [[3, 54], [1693.0, 14914.0]]
NS 0.06758869798721513 1.920563918861942e-27 [[6, 747], [1690.0, 14221.0]]
Pr-A 0.1624647855833296 1.6278070219087217e-63 [[62, 5885], [589.0, 9083.0]]
Pr-W 0.5465715336277666 2.023405810891417e-08 [[92, 3464], [559.0, 11504.0]]
Pr-B 0.1909860826410616 1.3767449328306686e-08 [[7, 806], [644.0, 14162.0]]
Pr-F 1.6879271297544647 0.0001249201569689377 [[73, 1042], [578.0, 13926.0]]
En-Sd 2.3506503744580214 0.053152944183913135 [[6, 59], [645.0, 14909.0]]
En-Sp 0.9908371286055085 1.0 [[15, 348], [636.0, 14620.0]]
En-W 4.017203144949936 0.024484895054830192 [[4, 23], [647.0, 14945.0]]
En-Pd 3.6580953850902915 4.2436721248647006e-07 [[24, 155], [627.0, 14813.0]]
En-Pp 1.7273454545454545 0.012501714757568927 [[26, 352], [625.0, 14616.0]]
Tr-S 1.8082856017213687 0.2914953837078631 [[4, 51], [647.0, 14917.0]]
Tr-P 3.2317405903912113 1.4726935990652035e-23 [[122, 997], [529.0, 13971.0]]
Tr-I 1.6211290278573571 0.171644244755815 [[8, 114], [643.0, 14854.0]]
Hc-P 0.2913670316126371 0.006096564170592466 [[4, 311], [647.0, 14657.0]]
Hc-H 0.8511099697540375 1.0 [[2, 54], [649.0, 14914.0]]
NS 6.836006517370805 5.6715846419086585e-68 [[172, 747], [479.0, 14221.0]]
Pr-A 1.4052262914046818 3.193497314751803e-38 [[4637, 5885], [5093.0, 9083.0]]
Pr-W 1.0885754409673547 0.005613221839185514 [[2402, 3464], [7328.0, 11504.0]]
Pr-B 0.019886605168510582 1.8738270089337863e-159 [[11, 806], [9719.0, 14162.0]]
Pr-F 0.7423213115955639 5.874878403594816e-08 [[512, 1042], [9218.0, 13926.0]]
En-Sd 1.1216972598257668 0.6119182997533829 [[43, 59], [9687.0, 14909.0]]
En-Sp 1.071541063363765 0.41807205283336224 [[242, 348], [9488.0, 14620.0]]
En-W 0.9362861796767322 1.0 [[14, 23], [9716.0, 14945.0]]
En-Pd 1.0024241287240494 1.0 [[101, 155], [9629.0, 14813.0]]
En-Pp 1.0456149845307996 0.6092700953581481 [[239, 352], [9491.0, 14616.0]]
Tr-S 0.9046088538508187 0.7330413293657585 [[30, 51], [9700.0, 14917.0]]
Tr-P 0.9192651879634084 0.11816325487723417 [[599, 997], [9131.0, 13971.0]]
Tr-I 0.8627237450132677 0.3567788730979695 [[64, 114], [9666.0, 14854.0]]
Hc-P 0.0581954525993135 2.8647253430364884e-52 [[12, 311], [9718.0, 14657.0]]
Hc-H 1.0256516058042775 0.914256715499572 [[36, 54], [9694.0, 14914.0]]
NS 0.8738047247933294 0.029821004531799478 [[427, 747], [9303.0, 14221.0]]
Pr-A 0.23818595057928663 2.236952445294526e-153 [[323, 5885], [2093.0, 9083.0]]
Pr-W 0.770557040111269 2.0066880841137127e-06 [[455, 3464], [1961.0, 11504.0]]
Pr-B 7.788216215895717 2.0042205275118214e-259 [[742, 806], [1674.0, 14162.0]]
Pr-F 2.363068698678868 2.673743261926909e-35 [[363, 1042], [2053.0, 13926.0]]
En-Sd 0.41906287770188605 0.09873307057889252 [[4, 59], [2412.0, 14909.0]]
En-Sp 0.9241107026858821 0.6608429851608619 [[52, 348], [2364.0, 14620.0]]
En-W 0.0 0.06396153528908419 [[0, 23], [2416.0, 14945.0]]
En-Pd 0.39720590995629207 0.002089334647878005 [[10, 155], [2406.0, 14813.0]]
En-Pp 0.5750104909777591 0.0016823232139883632 [[33, 352], [2383.0, 14616.0]]
Tr-S 0.0 0.0008198459150934424 [[0, 51], [2416.0, 14917.0]]
Tr-P 0.3143000733049763 1.8402472560241576e-21 [[53, 997], [2363.0, 13971.0]]
Tr-I 1.3073402569970076 0.26468037308163117 [[24, 114], [2392.0, 14854.0]]
Hc-P 5.830796380767217 3.605088624399581e-81 [[266, 311], [2150.0, 14657.0]]
Hc-H 0.3433715522401805 0.05614670695258997 [[3, 54], [2413.0, 14914.0]]
NS 0.07118294532513102 2.5705566344327756e-37 [[9, 747], [2407.0, 14221.0]]
Pr-A nan 1.0 [[0, 5885], [0.0, 9083.0]]
Pr-W nan 1.0 [[0, 3464], [0.0, 11504.0]]
Pr-B nan 1.0 [[0, 806], [0.0, 14162.0]]
Pr-F nan 1.0 [[0, 1042], [0.0, 13926.0]]
En-Sd nan 1.0 [[0, 59], [0.0, 14909.0]]
En-Sp nan 1.0 [[0, 348], [0.0, 14620.0]]
En-W nan 1.0 [[0, 23], [0.0, 14945.0]]
En-Pd nan 1.0 [[0, 155], [0.0, 14813.0]]
En-Pp nan 1.0 [[0, 352], [0.0, 14616.0]]
Tr-S nan 1.0 [[0, 51], [0.0, 14917.0]]
Tr-P nan 1.0 [[0, 997], [0.0, 13971.0]]
Tr-I nan 1.0 [[0, 114], [0.0, 14854.0]]
Hc-P nan 1.0 [[0, 311], [0.0, 14657.0]]
Hc-H nan 1.0 [[0, 54], [0.0, 14914.0]]
NS nan 1.0 [[0, 747], [0.0, 14221.0]]
Pr-A 1.2764007850814194 4.0755655433711264e-21 [[4699, 5885], [5682.0, 9083.0]]
Pr-W 1.0501603041342333 0.10384414174317344 [[2494, 3464], [7887.0, 11504.0]]
Pr-B 0.030519439626903216 3.948499231679287e-159 [[18, 806], [10363.0, 14162.0]]
Pr-F 0.7981155299393619 2.1350211873678833e-05 [[585, 1042], [9796.0, 13926.0]]
En-Sd 1.198417619769418 0.377805338698898 [[49, 59], [10332.0, 14909.0]]
En-Sp 1.0664711599158936 0.45144754610565646 [[257, 348], [10124.0, 14620.0]]
En-W 1.1286390964510025 0.7514789076036937 [[18, 23], [10363.0, 14945.0]]
En-Pd 1.1647784459765487 0.2216479384996152 [[125, 155], [10256.0, 14813.0]]
En-Pp 1.0877345519249433 0.3199448127236231 [[265, 352], [10116.0, 14616.0]]
Tr-S 0.9611159434296576 0.9122768135132235 [[34, 51], [10347.0, 14917.0]]
Tr-P 1.045900745715407 0.3876751339707447 [[721, 997], [9660.0, 13971.0]]
Tr-I 0.91002751811141 0.5502457629914534 [[72, 114], [10309.0, 14854.0]]
Hc-P 0.07275039824539362 1.2754377804783065e-51 [[16, 311], [10365.0, 14657.0]]
Hc-H 1.0146995104937675 1.0 [[38, 54], [10343.0, 14914.0]]
NS 1.1657587892632344 0.0068058200909345435 [[599, 747], [9782.0, 14221.0]]
Pr-A nan 1.0 [[0, 5885], [0.0, 9083.0]]
Pr-W nan 1.0 [[0, 3464], [0.0, 11504.0]]
Pr-B nan 1.0 [[0, 806], [0.0, 14162.0]]
Pr-F nan 1.0 [[0, 1042], [0.0, 13926.0]]
En-Sd nan 1.0 [[0, 59], [0.0, 14909.0]]
En-Sp nan 1.0 [[0, 348], [0.0, 14620.0]]
En-W nan 1.0 [[0, 23], [0.0, 14945.0]]
En-Pd nan 1.0 [[0, 155], [0.0, 14813.0]]
En-Pp nan 1.0 [[0, 352], [0.0, 14616.0]]
Tr-S nan 1.0 [[0, 51], [0.0, 14917.0]]
Tr-P nan 1.0 [[0, 997], [0.0, 13971.0]]
Tr-I nan 1.0 [[0, 114], [0.0, 14854.0]]
Hc-P nan 1.0 [[0, 311], [0.0, 14657.0]]
Hc-H nan 1.0 [[0, 54], [0.0, 14914.0]]
NS nan 1.0 [[0, 747], [0.0, 14221.0]]
Pr-A 0.29641753925965103 2.551605400964937e-11 [[29, 5885], [151.0, 9083.0]]
Pr-W 0.7733873263943813 0.21244279261099747 [[34, 3464], [146.0, 11504.0]]
Pr-B 6.031739565201288 1.3467043737187781e-18 [[46, 806], [134.0, 14162.0]]
Pr-F 1.4849648112603966 0.13860694123361939 [[18, 1042], [162.0, 13926.0]]
En-Sd 1.41170343717451 0.512587328383429 [[1, 59], [179.0, 14909.0]]
En-Sp 0.23470108521158414 0.13485578173566978 [[1, 348], [179.0, 14620.0]]
En-W 0.0 1.0 [[0, 23], [180.0, 14945.0]]
En-Pd 1.619792236194642 0.43957989016331256 [[3, 155], [177.0, 14813.0]]
En-Pp 0.46654749744637386 0.4497847359448516 [[2, 352], [178.0, 14616.0]]
Tr-S 0.0 1.0 [[0, 51], [180.0, 14917.0]]
Tr-P 0.824296418667768 0.6527148928406563 [[10, 997], [170.0, 13971.0]]
Tr-I 2.208444840915849 0.16269180905899303 [[3, 114], [177.0, 14854.0]]
Hc-P 7.601389897313557 2.2132966867186292e-13 [[25, 311], [155.0, 14657.0]]
Hc-H 0.0 1.0 [[0, 54], [180.0, 14914.0]]
NS 0.2139043063640329 0.013627939739904006 [[2, 747], [178.0, 14221.0]]
Pr-A nan 1.0 [[0, 5885], [0.0, 9083.0]]
Pr-W nan 1.0 [[0, 3464], [0.0, 11504.0]]
Pr-B nan 1.0 [[0, 806], [0.0, 14162.0]]
Pr-F nan 1.0 [[0, 1042], [0.0, 13926.0]]
En-Sd nan 1.0 [[0, 59], [0.0, 14909.0]]
En-Sp nan 1.0 [[0, 348], [0.0, 14620.0]]
En-W nan 1.0 [[0, 23], [0.0, 14945.0]]
En-Pd nan 1.0 [[0, 155], [0.0, 14813.0]]
En-Pp nan 1.0 [[0, 352], [0.0, 14616.0]]
Tr-S nan 1.0 [[0, 51], [0.0, 14917.0]]
Tr-P nan 1.0 [[0, 997], [0.0, 13971.0]]
Tr-I nan 1.0 [[0, 114], [0.0, 14854.0]]
Hc-P nan 1.0 [[0, 311], [0.0, 14657.0]]
Hc-H nan 1.0 [[0, 54], [0.0, 14914.0]]
NS nan 1.0 [[0, 747], [0.0, 14221.0]]
Pr-A 1.1125397546903375 0.02818185763382116 [[834, 5885], [1157.0, 9083.0]]
Pr-W 1.057886606610281 0.32308910897636933 [[481, 3464], [1510.0, 11504.0]]
Pr-B 0.0 2.476691958845903e-45 [[0, 806], [1991.0, 14162.0]]
Pr-F 0.5303999639175516 2.097775509860076e-08 [[76, 1042], [1915.0, 13926.0]]
En-Sd 0.6361906224930445 0.4361429485871955 [[5, 59], [1986.0, 14909.0]]
En-Sp 0.8174279475725527 0.2631359298283783 [[38, 348], [1953.0, 14620.0]]
En-W 1.6359078768772712 0.36925019693946404 [[5, 23], [1986.0, 14945.0]]
En-Pd 0.8230251331829918 0.5509370900222303 [[17, 155], [1974.0, 14813.0]]
En-Pp 1.1135543157203807 0.4810843988915803 [[52, 352], [1939.0, 14616.0]]
Tr-S 2.5189125295508275 0.002009889597174559 [[17, 51], [1974.0, 14917.0]]
Tr-P 1.678727408321703 5.721312892154466e-10 [[213, 997], [1778.0, 13971.0]]
Tr-I 0.989106115491157 1.0 [[15, 114], [1976.0, 14854.0]]
Hc-P 0.09487391517532773 1.3715972330113823e-12 [[4, 311], [1987.0, 14657.0]]
Hc-H 1.8151705800846347 0.057215537490759756 [[13, 54], [1978.0, 14914.0]]
NS 1.406761169092046 0.0005787500204376309 [[137, 747], [1854.0, 14221.0]]
Pr-A nan 1.0 [[0, 5885], [0.0, 9083.0]]
Pr-W nan 1.0 [[0, 3464], [0.0, 11504.0]]
Pr-B nan 1.0 [[0, 806], [0.0, 14162.0]]
Pr-F nan 1.0 [[0, 1042], [0.0, 13926.0]]
En-Sd nan 1.0 [[0, 59], [0.0, 14909.0]]
En-Sp nan 1.0 [[0, 348], [0.0, 14620.0]]
En-W nan 1.0 [[0, 23], [0.0, 14945.0]]
En-Pd nan 1.0 [[0, 155], [0.0, 14813.0]]
En-Pp nan 1.0 [[0, 352], [0.0, 14616.0]]
Tr-S nan 1.0 [[0, 51], [0.0, 14917.0]]
Tr-P nan 1.0 [[0, 997], [0.0, 13971.0]]
Tr-I nan 1.0 [[0, 114], [0.0, 14854.0]]
Hc-P nan 1.0 [[0, 311], [0.0, 14657.0]]
Hc-H nan 1.0 [[0, 54], [0.0, 14914.0]]
NS nan 1.0 [[0, 747], [0.0, 14221.0]]
Pr-A 0.019787377731302964 4.926813612546631e-16 [[1, 5885], [78.0, 9083.0]]
Pr-W 0.04257713033694558 4.6024416856852484e-08 [[1, 3464], [78.0, 11504.0]]
Pr-B 18.02125087484889 2.696619432615863e-29 [[40, 806], [39.0, 14162.0]]
Pr-F 1.0984671206583756 0.8228515422912354 [[6, 1042], [73.0, 13926.0]]
En-Sd 0.0 1.0 [[0, 59], [79.0, 14909.0]]
En-Sp 0.0 0.26702551236445826 [[0, 348], [79.0, 14620.0]]
En-W 0.0 1.0 [[0, 23], [79.0, 14945.0]]
En-Pd 0.0 1.0 [[0, 155], [79.0, 14813.0]]
En-Pp 1.639055023923445 0.43685596099512336 [[3, 352], [76.0, 14616.0]]
Tr-S 0.0 1.0 [[0, 51], [79.0, 14917.0]]
Tr-P 0.1796543476583597 0.06442277689689606 [[1, 997], [78.0, 13971.0]]
Tr-I 1.6704903283850652 0.4553851297568427 [[1, 114], [78.0, 14854.0]]
Hc-P 18.189992666553845 1.2199182090405693e-18 [[22, 311], [57.0, 14657.0]]
Hc-H 0.0 1.0 [[0, 54], [79.0, 14914.0]]
NS 0.24407029828716575 0.18815211721269076 [[1, 747], [78.0, 14221.0]]
Pr-A 0.0116046275416664 1.3323109978182741e-27 [[1, 5885], [133.0, 9083.0]]
Pr-W 0.12872155683262618 7.521467592546501e-10 [[5, 3464], [129.0, 11504.0]]
Pr-B 0.0 0.0013994281800665611 [[0, 806], [134.0, 14162.0]]
Pr-F 1.1952155797949535 0.49837975221714775 [[11, 1042], [123.0, 13926.0]]
En-Sd 1.8999617688288517 0.41480363016823824 [[1, 59], [133.0, 14909.0]]
En-Sp 0.0 0.07895695912665159 [[0, 348], [134.0, 14620.0]]
En-W 0.0 1.0 [[0, 23], [134.0, 14945.0]]
En-Pd 3.7041760440110028 0.014029434043745742 [[5, 155], [129.0, 14813.0]]
En-Pp 1.2776223776223776 0.5607825725249646 [[4, 352], [130.0, 14616.0]]
Tr-S 0.0 1.0 [[0, 51], [134.0, 14917.0]]
Tr-P 5.345179869505423 1.0452593982858716e-13 [[37, 997], [97.0, 13971.0]]
Tr-I 0.9796860572483841 1.0 [[1, 114], [133.0, 14854.0]]
Hc-P 0.35435050649130867 0.5321926928594225 [[1, 311], [133.0, 14657.0]]
Hc-H 2.0765803397382343 0.38803197244119225 [[1, 54], [133.0, 14914.0]]
NS 16.892414729339894 2.886576056781655e-44 [[63, 747], [71.0, 14221.0]]
Pr-A 0.0 0.164107889841359 [[0, 5885], [5.0, 9083.0]]
Pr-W 0.0 0.5961521532563814 [[0, 3464], [5.0, 11504.0]]
Pr-B 11.713813068651778 0.026079405101370422 [[2, 806], [3.0, 14162.0]]
Pr-F 0.0 1.0 [[0, 1042], [5.0, 13926.0]]
En-Sd 0.0 1.0 [[0, 59], [5.0, 14909.0]]
En-Sp 0.0 1.0 [[0, 348], [5.0, 14620.0]]
En-W 0.0 1.0 [[0, 23], [5.0, 14945.0]]
En-Pd 0.0 1.0 [[0, 155], [5.0, 14813.0]]
En-Pp 0.0 1.0 [[0, 352], [5.0, 14616.0]]
Tr-S 0.0 1.0 [[0, 51], [5.0, 14917.0]]
Tr-P 0.0 1.0 [[0, 997], [5.0, 13971.0]]
Tr-I 0.0 1.0 [[0, 114], [5.0, 14854.0]]
Hc-P 31.419078242229368 0.004178032283020019 [[2, 311], [3.0, 14657.0]]
Hc-H 0.0 1.0 [[0, 54], [5.0, 14914.0]]
NS 0.0 1.0 [[0, 747], [5.0, 14221.0]]
Pr-A 0.06430897762673464 0.00010407710665741586 [[1, 5885], [24.0, 9083.0]]
Pr-W 0.6325745078631915 0.4844689905451397 [[4, 3464], [21.0, 11504.0]]
Pr-B 0.0 0.6429934066382045 [[0, 806], [25.0, 14162.0]]
Pr-F 1.8224568138195778 0.2513038850245739 [[3, 1042], [22.0, 13926.0]]
En-Sd 0.0 1.0 [[0, 59], [25.0, 14909.0]]
En-Sp 1.7504789272030652 0.4452821378371988 [[1, 348], [24.0, 14620.0]]
En-W 0.0 1.0 [[0, 23], [25.0, 14945.0]]
En-Pd 0.0 1.0 [[0, 155], [25.0, 14813.0]]
En-Pp 3.610671936758893 0.11695443918898807 [[2, 352], [23.0, 14616.0]]
Tr-S 0.0 1.0 [[0, 51], [25.0, 14917.0]]
Tr-P 3.503259779338014 0.02287063148107479 [[5, 997], [20.0, 13971.0]]
Tr-I 5.429093567251462 0.17522955164759707 [[1, 114], [24.0, 14854.0]]
Hc-P 0.0 1.0 [[0, 311], [25.0, 14657.0]]
Hc-H 0.0 1.0 [[0, 54], [25.0, 14914.0]]
NS 8.958815654775966 2.0046719627012813e-05 [[8, 747], [17.0, 14221.0]]
In [58]:
for label, d in df_dict.items():
    try:
        logFC_up = d[d['log2FoldChange_fb'] > 0.5]
        logFC_down = d[d['log2FoldChange_fb'] < -0.5]
        u.dp([label, len(d), 100 * (len(d)/len(df_all))])
        print('LogFC up: ', len(logFC_up), 100 * (len(logFC_up)/len(d)))
        print('LogFC down: ', len(logFC_down), 100 * (len(logFC_down)/len(d)))
    except:
        print("unable to run:", label)
--------------------------------------------------------------------------------
                       All genes	14968	71.61722488038278	                       
--------------------------------------------------------------------------------
LogFC up:  3030 20.243185462319616
LogFC down:  1554 10.38214858364511
--------------------------------------------------------------------------------
                  Genes with H3K27me3	2596	12.421052631578949	                  
--------------------------------------------------------------------------------
LogFC up:  1279 49.26810477657935
LogFC down:  212 8.166409861325114
--------------------------------------------------------------------------------
                Genes without H3K27me3	12372	59.196172248803826	                
--------------------------------------------------------------------------------
LogFC up:  1751 14.152925961849338
LogFC down:  1342 10.847074038150662
--------------------------------------------------------------------------------
              Genes with at least 1 sig	12797	61.229665071770334	               
--------------------------------------------------------------------------------
LogFC up:  2988 23.34922247401735
LogFC down:  1518 11.862155192623272
--------------------------------------------------------------------------------
                     Non-sig genes	2171	10.38755980861244	                      
--------------------------------------------------------------------------------
LogFC up:  42 1.9345923537540304
LogFC down:  36 1.6582220175034548
--------------------------------------------------------------------------------
                   Sig with H3K27me3	2416	11.559808612440191	                   
--------------------------------------------------------------------------------
LogFC up:  1271 52.60761589403974
LogFC down:  205 8.485099337748345
--------------------------------------------------------------------------------
                    NS with H3K27me3	180	0.8612440191387559	                    
--------------------------------------------------------------------------------
LogFC up:  8 4.444444444444445
LogFC down:  7 3.888888888888889
--------------------------------------------------------------------------------
                     Sig unmarked	10381	49.66985645933014	                      
--------------------------------------------------------------------------------
LogFC up:  1717 16.539832386089973
LogFC down:  1313 12.648107118774684
--------------------------------------------------------------------------------
                      NS unmarked	1991	9.526315789473685	                       
--------------------------------------------------------------------------------
LogFC up:  34 1.7076845806127574
LogFC down:  29 1.4565544952285283
--------------------------------------------------------------------------------
                  Sig Marked Perturbed	720	3.4449760765550237	                  
--------------------------------------------------------------------------------
LogFC up:  616 85.55555555555556
LogFC down:  58 8.055555555555555
--------------------------------------------------------------------------------
                Sig Marked un-Perturbed	1696	8.114832535885167	                 
--------------------------------------------------------------------------------
LogFC up:  655 38.62028301886792
LogFC down:  147 8.66745283018868
--------------------------------------------------------------------------------
                 Sig Unmarked Perturbed	651	3.1148325358851676	                 
--------------------------------------------------------------------------------
LogFC up:  474 72.81105990783409
LogFC down:  140 21.50537634408602
--------------------------------------------------------------------------------
               Sig Unmarked un-Perturbed	9730	46.55502392344498	                
--------------------------------------------------------------------------------
LogFC up:  1243 12.77492291880781
LogFC down:  1173 12.055498458376155
--------------------------------------------------------------------------------
               Exp. Sig Marked Perturbed	720	3.4449760765550237	                
--------------------------------------------------------------------------------
LogFC up:  616 85.55555555555556
LogFC down:  58 8.055555555555555
--------------------------------------------------------------------------------
              Exp. Sig Marked un-Perturbed	1696	8.114832535885167	              
--------------------------------------------------------------------------------
LogFC up:  655 38.62028301886792
LogFC down:  147 8.66745283018868
--------------------------------------------------------------------------------
              Exp. Sig Unmarked Perturbed	651	3.1148325358851676	               
--------------------------------------------------------------------------------
LogFC up:  474 72.81105990783409
LogFC down:  140 21.50537634408602
--------------------------------------------------------------------------------
             Exp. Sig Unmarked un-Perturbed	9730	46.55502392344498	             
--------------------------------------------------------------------------------
LogFC up:  1243 12.77492291880781
LogFC down:  1173 12.055498458376155
--------------------------------------------------------------------------------
        Sig genes with H3K27me3 and expression	2416	11.559808612440191	         
--------------------------------------------------------------------------------
LogFC up:  1271 52.60761589403974
LogFC down:  205 8.485099337748345
--------------------------------------------------------------------------------
                  Sig genes with H3K27me3 NO expression	0	0.0	                  
--------------------------------------------------------------------------------
unable to run: Sig genes with H3K27me3 NO expression
--------------------------------------------------------------------------------
           Sig genes unmarked and expression	10381	49.66985645933014	           
--------------------------------------------------------------------------------
LogFC up:  1717 16.539832386089973
LogFC down:  1313 12.648107118774684
--------------------------------------------------------------------------------
                    Sig genes unmarked NO expression	0	0.0	                     
--------------------------------------------------------------------------------
unable to run: Sig genes unmarked NO expression
--------------------------------------------------------------------------------
         NS genes with H3K27me3 and expression	180	0.8612440191387559	          
--------------------------------------------------------------------------------
LogFC up:  8 4.444444444444445
LogFC down:  7 3.888888888888889
--------------------------------------------------------------------------------
                  NS genes with H3K27me3 NO expression	0	0.0	                   
--------------------------------------------------------------------------------
unable to run: NS genes with H3K27me3 NO expression
--------------------------------------------------------------------------------
            NS genes unmarked and expression	1991	9.526315789473685	            
--------------------------------------------------------------------------------
LogFC up:  34 1.7076845806127574
LogFC down:  29 1.4565544952285283
--------------------------------------------------------------------------------
                     NS genes unmarked NO expression	0	0.0	                     
--------------------------------------------------------------------------------
unable to run: NS genes unmarked NO expression
--------------------------------------------------------------------------------
                   Sig ectopic marked	79	0.3779904306220096	                    
--------------------------------------------------------------------------------
LogFC up:  74 93.67088607594937
LogFC down:  0 0.0
--------------------------------------------------------------------------------
                  Sig ectopic unmarked	134	0.6411483253588517	                  
--------------------------------------------------------------------------------
LogFC up:  104 77.61194029850746
LogFC down:  0 0.0
--------------------------------------------------------------------------------
                   NS ectopic marked	5	0.023923444976076555	                    
--------------------------------------------------------------------------------
LogFC up:  0 0.0
LogFC down:  0 0.0
--------------------------------------------------------------------------------
                  NS ectopic unmarked	25	0.11961722488038277	                   
--------------------------------------------------------------------------------
LogFC up:  9 36.0
LogFC down:  0 0.0

15) Plot the general trend of the RNA seq

In [59]:
fb_genes = ['Emx1', 'Eomes', 'Tbr1', 'Foxg1', 'Lhx6']
mb_genes = ['En1', 'En2', 'Lmx1a', 'Bhlhe23', 'Sall4']
hb_genes = ['Phox2b', 'Krox20', 'Fev', 'Hoxb1',  'Hoxd3']
sc_genes = ['Hoxd8', 'Hoxd9', 'Hoxd10', 'Hoxd11','Hoxa7', 'Hoxa9', 'Hoxa10',
            'Hoxb9', 'Hoxb13',  'Hoxc8', 'Hoxc9', 'Hoxc10', 'Hoxc11', 'Hoxc12', 'Hoxc13']
progenitors = ['Sox2', 'Sox1', 'Sox3', 'Hes1', 'Hes5']
neurons = ['Tubb3', 'Snap25', 'Syt1', 'Slc32a1','Slc17a6']
glia = ['Pdgfra', 'Cspg4', 'Aqp4', 'Egfr', 'Slc6a11']

from collections import defaultdict
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from statannot import add_stat_annotation

hox_genes = sc_genes
forebrain_genes = fb_genes
h3cols = []
fb_rna_cols = []
sc_rna_cols = []

# Do it in this way so they are correctly ordered
for c in df_bg.columns:
    if 'H3K27me' in c and 'signal' in c:
        h3cols.append(c)
    elif 'wt' in c:
        if 'fb' in c:
            fb_rna_cols.append(c)
        if 'sc' in c:
            sc_rna_cols.append(c)
for c in df_bg.columns:
    if 'ko' in c:
        if 'fb' in c:
            fb_rna_cols.append(c)
        if 'sc' in c:
            sc_rna_cols.append(c)

wt_colour = "#AADFF1"   
    
def plot_gene_boxplot(df, title, cluster_id, cols):
    boxplot = Boxplot(df, gene_name, cols[0])
    boxplot.label_font_size = 6
    boxplot.title_font_size = 8
    sns.set(rc={'figure.figsize': (1.5, 1.5), 'font.family': 'sans-serif',
                    'font.sans-serif': 'Arial', 'font.size': 8.0}, style='ticks')
    boxplot.palette = sci_colour
    box_df = boxplot.format_data_for_boxplot(df, cols, gene_name, df[gene_name].values)
    is_hox = []
    sns.set_style("ticks")
    for c in box_df['Samples'].values:
        if 'Hox' in c:
            is_hox.append('Hox')
        else:
            is_hox.append('Hox like')
    box_df['is_hox'] = is_hox

    boxplot = Boxplot(box_df, "Conditions", "Values", add_stats=False, add_dots=False, 
                       order=cols)
    boxplot.label_font_size = 6
    boxplot.title_font_size = 8
    plt.rcParams['figure.figsize'] = (1.5,1.5)
    sns.set(rc={'figure.figsize': (1.5, 1.5), 'font.family': 'sans-serif',
                    'font.sans-serif': 'Arial', 'font.size': 8.0}, style='ticks')
    boxplot.palette = sci_colour
    ax = boxplot.plot()
    ax.tick_params(labelsize=6)
    
    plt.title(title, {'fontsize': 8, 'fontweight': 700})
    ax.set_ylim(0, 2.5)
    c = 0
    for b in ax.artists:
        if c < len(cols)/2:
            b.set_facecolor(wt_colour)
        else:
            b.set_facecolor(ko_colour)
        c += 1
    save_fig(f'{title}')
    
    plt.show()

avgs_fb = [['wt11fb1',
         'wt11fb2'],
         ['wt13fb1',
         'wt13fb2'],
         ['wt15fb1',
         'wt15fb2'],
         ['wt18fb1',
         'wt18fb2'],
         ['ko11fb1',
         'ko11fb2'],
         ['ko13fb1',
         'ko13fb2'],
         ['ko15fb1',
         'ko15fb2'],
         ['ko18fb1',
         'ko18fb2']]

avgs_sc = [['wt11sc1',
         'wt11sc2'],
         ['wt13sc1',
         'wt13sc2'],
         ['wt15sc1',
         'wt15sc2'],
         ['wt18sc1',
         'wt18sc2'],
         ['ko11sc1',
         'ko11sc2'],
         ['ko13sc1',
         'ko13sc2'],
         ['ko15sc1',
         'ko15sc2'],
         ['ko18sc1',
         'ko18sc2']]
avgs_hb = [['wt11hb1',
         'wt11hb2'],
         ['wt13hb1',
         'wt13hb2'],
         ['wt15hb1',
         'wt15hb2'],
         ['wt18hb1',
         'wt18hb2'],
         ['ko11hb1',
         'ko11hb2'],
         ['ko13hb1',
         'ko13hb2'],
         ['ko15hb1',
         'ko15hb2'],
         ['ko18hb1',
         'ko18hb2']]

avgs_mb = [['wt11mb1',
         'wt11mb2'],
         ['wt13mb1',
         'wt13mb2'],
         ['wt15mb1',
         'wt15mb2'],
         ['wt18mb1',
         'wt18mb2'],
         ['ko11mb1',
         'ko11mb2'],
         ['ko13mb1',
         'ko13mb2'],
         ['ko15mb1',
         'ko15mb2'],
         ['ko18mb1',
         'ko18mb2']]
for label, df in df_dict.items():
    if 'ectopic' in label:

        avgs_df = pd.DataFrame()
        avgs_df[gene_name] = df[gene_name].values
        fb_cols = []
        for f in avgs_fb:
            new_col = f'{f[0][:2]} {f[0][2:4]} FB'
            avgs_df[new_col] = (df[f[0]].values + df[f[1]].values) / 2.0
            fb_cols.append(new_col)

        sc_cols = []
        for f in avgs_sc:
            new_col = f'{f[0][:2]} {f[0][2:4]} SC'
            avgs_df[new_col] = (df[f[0]].values + df[f[1]].values) / 2.0
            sc_cols.append(new_col)

        mb_cols = []
        for f in avgs_mb:
            new_col = f'{f[0][:2]} {f[0][2:4]} MB'
            avgs_df[new_col] = (df[f[0]].values + df[f[1]].values) / 2.0
            mb_cols.append(new_col)

        hb_cols = []
        for f in avgs_hb:
            new_col = f'{f[0][:2]} {f[0][2:4]} HB'
            avgs_df[new_col] = (df[f[0]].values + df[f[1]].values) / 2.0
            hb_cols.append(new_col)
        plot_gene_boxplot(avgs_df, f'{label} FB', '', fb_cols)
        plot_gene_boxplot(avgs_df, f'{label} MB', '', mb_cols)
        plot_gene_boxplot(avgs_df, f'{label} HB', '', hb_cols)

        plot_gene_boxplot(avgs_df, f'{label} SC', '', sc_cols)
No handles with labels found to put in legend.
No handles with labels found to put in legend.
No handles with labels found to put in legend.
No handles with labels found to put in legend.
No handles with labels found to put in legend.
No handles with labels found to put in legend.
No handles with labels found to put in legend.
No handles with labels found to put in legend.
No handles with labels found to put in legend.
No handles with labels found to put in legend.
No handles with labels found to put in legend.
No handles with labels found to put in legend.
No handles with labels found to put in legend.
No handles with labels found to put in legend.
No handles with labels found to put in legend.
No handles with labels found to put in legend.
In [60]:
"""
---------------------------------------------------------------
            Read in results from DEseq2 and format DFs nicer
---------------------------------------------------------------
"""


# Want to consider genes that are DE in FB & MB vs SC and HB and then look at those 
# which are "consistently affected"
# i.e. for each gene make a "code of how it was affected"
df_all = pd.read_csv(f'{output_dir}df-all_epi-2500_{date}.csv')

df_dict = {}
deseq2_files = os.listdir(r_dir)
for f in deseq2_files:
    if 'DEseq2' in f:
        try:
            de_df = pd.read_csv(os.path.join(r_dir, f))
            de_df = de_df.rename(columns={de_df.columns[0]: 'u_id'})
            gene_names = [s.split('-')[1] for s in list(de_df['u_id'].values)]
            gene_ids = [s.split('-')[0] for s in list(de_df['u_id'].values)]
            de_df['padj'] = de_df['padj'].fillna(1) # Replace Nan p values with 1.0s
            # Replace Nans with 0's for other values
            de_df = de_df.replace(np.nan, 0)
            de_df[gene_id] = gene_ids
            de_df[gene_name] = gene_names
            df_dict[f] = de_df
        except:
            print(f)
/Users/ariane/opt/miniconda3/envs/ml/lib/python3.8/site-packages/IPython/core/interactiveshell.py:3062: DtypeWarning: Columns (4) have mixed types.Specify dtype option on import or set low_memory=False.
  has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
merged_df_FEATURE_COUNTS_DEseq2Norm_20210124.csv

Plot the venn diagrams of overlapping DE genes

In [61]:
df_dict.keys()

fb_unique_genes = []
mb_unique_genes = []
hb_unique_genes = []
sc_unique_genes = []
anterior_genes = []
posterior_genes = []

# DEseq2_CNS_wt_fb-hb_20210124.csv
fb_genes_all = []
mb_genes_all = []
hb_genes_all = []
sc_genes_all = []
cutoff = 2.0
neg_cutoff = -2.0
for k, d in df_dict.items():
    if 'wt_fb-' in k:
        gs = d[d['padj'] < 0.05]
        gs = gs[gs['log2FoldChange'] > cutoff] # Only want those that are "selectively on " in FB
        fb_genes_all += list(gs[gene_name].values)
    if 'wt_' in k and 'mb' in k:
        gs = d[d['padj'] < 0.05]
        if k == 'DEseq2_CNS_wt_fb-mb_20210124.csv':
            gs = gs[gs['log2FoldChange'] < neg_cutoff]
        else:
            gs = gs[gs['log2FoldChange'] > cutoff]
        mb_genes_all += list(gs[gene_name].values)
    if 'wt_' in k and 'hb' in k:
        gs = d[d['padj'] < 0.05]
        if k == 'DEseq2_CNS_wt_hb-sc_20210124.csv':
            gs = gs[gs['log2FoldChange'] > cutoff]
        else:
            gs = gs[gs['log2FoldChange'] < neg_cutoff]
        hb_genes_all += list(gs[gene_name].values)
    if 'wt_' in k and 'sc' in k:
        gs = d[d['padj'] < 0.05]
        gs = gs[gs['log2FoldChange'] < neg_cutoff] # Only want those that are "selectively on " in FB
        sc_genes_all += list(gs[gene_name].values)
In [62]:
# Now see which are also in the consistently affected genes
const_aff = pd.read_csv(f'{output_dir}df-consistent_epi-2500_{date}.csv')
const_aff_genes = const_aff[gene_name]

plot_venn([set(fb_genes_all), 
           set(mb_genes_all), set(hb_genes_all), set(sc_genes_all)],
          ['FB', 'MB', 'HB', 'SC'], 'SpecificGenes_loc', colours=[fb_colour, mb_colour, hb_colour, sc_colour])

plot_venn([set(fb_genes_all) & set(const_aff_genes), 
           set(mb_genes_all) & set(const_aff_genes), 
           set(hb_genes_all) & set(const_aff_genes), 
           set(sc_genes_all) & set(const_aff_genes)],
          ['FB', 'MB', 'HB', 'SC'], 'SpecificGenes_CAFF_loc', colours=[fb_colour, mb_colour, hb_colour, sc_colour])
/Users/ariane/opt/miniconda3/envs/ml/lib/python3.8/site-packages/IPython/core/interactiveshell.py:3062: DtypeWarning: Columns (4) have mixed types.Specify dtype option on import or set low_memory=False.
  has_raised = await self.run_ast_nodes(code_ast.body, cell_name,

This is just for supp. info

In [64]:
mb_caff = set(mb_genes_all) & set(const_aff_genes)
fb_caff = set(fb_genes_all) & set(const_aff_genes)
hb_caff = set(hb_genes_all) & set(const_aff_genes)
sc_caff = set(sc_genes_all) & set(const_aff_genes)

fb_spec = []
anterior_spec = []
for g in fb_caff:
    if g not in mb_caff and g not in hb_caff and g not in sc_caff:
        fb_spec.append(g)
    if g in mb_caff and g not in hb_caff and g not in sc_caff:
        anterior_spec.append(g)
        
mb_spec = []
for g in mb_caff:
    if g not in fb_caff and g not in hb_caff and g not in sc_caff:
        mb_spec.append(g)
        
posterior_spec = []
hb_spec = []
for g in hb_caff:
    if g not in mb_caff and g not in fb_caff and g not in sc_caff:
        hb_spec.append(g)
    if g in sc_caff and g not in mb_caff and g not in fb_caff:
        posterior_spec.append(g)
        
sc_spec = []
for g in sc_caff:
    if g not in mb_caff and g not in hb_caff and g not in fb_caff:
        sc_spec.append(g)
       


print(', '.join(fb_spec), '\n')
print(', '.join(mb_spec), '\n')
print(', '.join(hb_spec), '\n')
print(', '.join(sc_spec), '\n')
print(', '.join(anterior_spec), '\n')
print(', '.join(posterior_spec), '\n')
Cdca2, Cenpe, Satb2, Cdca5, Abcg8, Esco2, Alx3, Lipg, Ly6g6e, Dmrta1, Spc25, H2ac11, Mpped1, Drd1, Top2a, Melk, F2rl1, Isl1, Kif11, Vgll2, Arx, Bub1, Tiam2, Gsc, Spata18, Cdc25c, Gabrd, Knl1, Lhx8, Kif18b, Aurkb, Ticrr, Car12, Cldn6, Mfng, Foxg1, Sla, Depdc1b, Arhgap11a, Cdc20, Ccna2, Neurod6, Cdca7, Cenpa, Six6, Pclaf, Fam81b, Rrm2, Mki67, Shisa2, Ndc80, Neurod2, Ermn, Mn1, Ube2c, Kif22, Gsx2, Edar, Cdc45, Exo1, Alx1, Sstr2, Trp73, Kcnj4, Krt8, Foxo6, Wdr63, Ccnb1, Fzd8, Cdca3, Kif4, Zbtb18, 4833427G06Rik, Gipr, Vax1, Sox3, Cntnap3, Ect2, Clspn, Cldn3, Sgo1, Gucy1b2, Kif2c, Tfap2c, Prr32, Tbx15, Pif1, Ska1, Ppp1r1b, Lrr1, Mybl2, Wfdc2, 1700012B09Rik, Troap, Mms22l, AC151602.1, Eme1, Crym, Slc17a7, Ncapg, Sox5, Hrob, Hmgb2, Pbk, Uhrf1, Krt18, Bub1b, Neurog2, Mcidas, Rbm47, Cep55 

Foxa1, Foxd2, Ptf1a, Fgf15, Slc13a5, Krt73, Fgf3, Foxn4, Cnpy1, Gpr6, Aldh1a1, Bhlhe23, Sec14l5, Slc6a3, Sall4, Tafa4, Ttc6, Dbx1, Onecut1, Tal2, Pitx1, Cpne9, Ano1, Irx6, Gsc2 

Amn, Pnmt, Frmpd3, Aqp5, Atp13a4, Ren1, Cybrd1, Tmem151a, Tlx1, Slc39a12, Cbln4, Ramp1, Hmx3, Fbxo2, Grin2d, Rassf6, Hmx2, Cryba2, Mc4r, Tmem91, Cd164l2, Rnf207, Htr1a, Rcan2, Gal3st1, Kcnc4, Rasgrf2, Arid3c, Rasd2 

Olig3, Col1a2, Galnt6, Arhgap8, Cda, Twist2, Casp12, Frem3, Pgm5, Car3, Hoxc12, Prrx1, Foxl1, Bace2, Got1l1, Gfap, Susd5, Ajap1, Ccdc80, Zar1, Snai1, P2ry2, Hoxd9, Scx, Col1a1, Ltbp2, Wnt6, Tgfbi, Col12a1, Lilrb4a, Adra1d, Wnt4, Zpld1, C1qtnf7, Adamtsl5, Hoxa10, Runx3, Abi3bp, Lox, Stac2, Six1, Hoxd13, Nrk, Tmem200b, Cabp1, Mc5r, Aldh1a2, Pax9, Pik3r5, Mkx, Ppl, Adgrg6, Hmx1, Fam129a, Thy1, Col14a1, Ptprv, Ntf3, Scarf2, Pax1, Slc14a1, Tnxb, Stum, Cyp26a1, Scara5, Gfra1, Hoxc9, Trim63, Obscn, Ism1, Sfrp5, Pth2r, Wnt11, Tmem30b, Hnf1b, Cntn4, Hoxa9, Hoxd8, Tlx2, Tmem119, Kcns3, Gja3, Ptpn3, Lsp1, Itgb4, Hoxd10, Catsperz, Comp, Hoxc13, Odf4, Ccn4, Card11, Twist1, Il31ra, Tnfaip8l3, Hoxc10, Hoxc8, Hapln1, Tspan8, Ackr4, Msc, Smoc2, Igfbp5 

Otor, Pla2g5, Six3, Tbr1, Dlx6, Nr2e1, Otx1, Fgf8, Insm1, Dlx1, Dlx5, Helt, Fezf2, Tcf7l2, Slc22a2, Kcnj13, Slc22a6, Tfap2d, Lmx1a, Ptx3, Sim2, Pdzph1, Otx2, Emx1, Fgf17, Ntng2, Dlx2, Alx4, Dmrta2, Neurog1 

Ecel1, Mag, Pld5, Coro6, Vamp1, Hoxa5, Nms, Npy5r, Prss56, Creg2, Cdh19, Tacr1, Hoxa3, Fndc5, Tshz2, Chst9, Grem2, Hoxb9, Slc25a18, Kcnh6, Hoxd4, Tekt5, Ptger2, Aqp4, Sptb, Akr1c13, Hoxa4, Lhx3, Mpz, Gfra3, Serpine2, Plekha2, Serpina3g, Tmem114, Ablim2, Serpinb1b, Akain1, Wt1, D630039A03Rik, Krt9, Hoxb4, Vgll3, Crabp1, Hoxb6, Hoxc6, Cartpt, Hoxb8, Pi15, Ntrk1, Mnx1, Hoxb7, Skap2, Dkkl1, Pkd2l1, Car10, Kcnk4, Foxd3, Arg1, Vwa5b2, Hoxc5, Hoxc4, Smad9, Ubap1l, C1qtnf3, Cdh3, Hoxd3, Msx3, Gm867, Ankrd34c, Ctnna3, Gdf6, Lynx1, Ngfr, Slc18a3, Pla2g7, Crhbp, Minar1, Ppef1, Tmie, Hoxa7, Nppc, Hoxa6, Hoxb5, Folh1, Itih3, Sv2c, Gpat3, Fxyd7, Il33, Serpina3n, Gal, Flt3, Alox12b, Hoxb3, Fgf1, Oprd1, Klhl14, Ptger4, Pamr1, Dsc3, Ankrd34b 

Run AP_axis functional

Run the functional enrichment and then re-read in the values from the GSEA and use this to plot the pathway terms.

In [19]:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
from adjustText import adjust_text

from sciviso import Vis


class Volcanoplot(Vis):

    def __init__(self, df: pd.DataFrame, log_fc: str, p_val: str, label_column: str, title='',
                 xlabel='', ylabel='', invert=False, p_val_cutoff=0.05,
                 log_fc_cuttoff=2, label_big_sig=False, colours=None, offset=None,
                 text_colours=None, values_to_label=None, max_labels=10, values_colours=None,
                 figsize=(1.5, 1.5), title_font_size=8, label_font_size=6, title_font_weight=700):
        super().__init__(df, figsize=figsize, title_font_size=title_font_size, label_font_size=label_font_size,
                         title_font_weight=title_font_weight)
        super().__init__(df)
        self.log_fc = log_fc
        self.p_val = p_val
        self.p_val_cutoff = p_val_cutoff
        self.log_fc_cuttoff = log_fc_cuttoff
        self.values_to_label = values_to_label
        self.label_big_sig = label_big_sig
        self.invert = invert
        self.label_column = label_column
        self.offset = offset
        self.label = 'volcanoplot'
        self.colours = {'ns_small-neg-logFC': 'lightgrey',
                        'ns_small-pos-logFC': 'lightgrey',
                        'ns_big-neg-logFC': 'grey',
                        'ns_big-pos-logFC': 'grey',
                        'sig_small-neg-logFC': 'lightgrey',
                        'sig_small-pos-logFC': 'lightgrey',
                        'sig_big-neg-logFC': '#df80ff',
                        'sig_big-pos-logFC': '#ffa366'} if colours is None else colours
        self.xlabel = xlabel
        self.ylabel = ylabel
        self.title = title
        self.figsize = figsize
        self.max_labels = max_labels
        self.values_colours = values_colours or {}
        self.text_colours = text_colours or {}

    def add_scatter_and_annotate(self, fig: plt, x_all: np.array, y_all: np.array,
                                 colour: str, idxs: np.array, annotate=False, s=20):
        x = x_all[idxs]
        y = y_all[idxs]
        ax = fig.scatter(x, y, c=colour, alpha=self.opacity, s=s, vmin=-10.0, vmax=10.0)

        # Check if we want to annotate any of these with their gene IDs

        if self.values_to_label is not None:
            texts = []
            labels = self.df[self.label_column].values[idxs]
            for i, name in enumerate(labels):
                if name in self.values_to_label:
                    lbl_bg = self.values_colours.get(name)
                    color = self.text_colours.get(name)
                    texts.append(fig.text(x[i], y[i], name, color=color, fontsize=6,
                                          bbox=dict(fc="white", alpha=0.5, boxstyle='round,pad=0.1', lw=0)))
            adjust_text(texts, force_text=2.0)
        # Check if the user wants these labeled
        if self.label_big_sig and annotate:
            # If they do have a limit on the number of ones we show (i.e. we don't want 10000 gene names...)
            max_values = -1 * self.max_labels
            if len(y) < self.max_labels:
                max_values = -1 * (len(y) - 1)
            most_sig_idxs = np.argpartition(y, max_values)[max_values:]
            labels = self.df[self.label_column].values[idxs][most_sig_idxs]
            x = x[most_sig_idxs]
            y = y[most_sig_idxs]
            # We only label the ones with the max log fc
            for i, name in enumerate(labels):
                name = (' ').join(name.split('_')[1:]) #Format nicer
                fig.annotate(name, (x[i], y[i]),
                             xytext=(0, 10),
                             textcoords='offset points', ha='center', va='bottom',
                             bbox=dict(boxstyle='round,pad=0.25', 
                                       fc=colour, alpha=0.2)
                             )
        return ax

    def plot(self):
        """
        For annotation styling see: https://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.annotate
        Returns
        -------

        """
        # if offset is not given, make the offset the smallest value in the dataset
        if not self.offset:
            vals = self.df[self.p_val].values
            self.offset = np.min(vals[np.nonzero(vals)])
            self.u.warn_p(['No offset was provided, setting offset to be smallest value recorded in dataset: ',
                           self.offset])

        # x axis has log_fc, first only plot the values < cutoff
        x = self.df[self.log_fc].values
        y = -1 * np.log10(self.df[self.p_val].values + self.offset)

        log_fc_np = self.df[self.log_fc].values
        p_val_np = self.df[self.p_val].values

        if self.invert:
            x = -1 * np.log10(self.df[self.p_val].values + self.offset)
            y = self.df[self.log_fc].values
        sig_small_pos_logfc = np.where((p_val_np <= self.p_val_cutoff) & (np.abs(log_fc_np) < self.log_fc_cuttoff)
                                       & (log_fc_np > 0))
        sig_big_pos_logfc = np.where((p_val_np <= self.p_val_cutoff) & (np.abs(log_fc_np) >= self.log_fc_cuttoff)
                                     & (log_fc_np > 0))

        sig_small_neg_logfc = np.where((p_val_np <= self.p_val_cutoff) & (np.abs(log_fc_np) < self.log_fc_cuttoff)
                                       & (log_fc_np <= 0))
        sig_big_neg_logfc = np.where((p_val_np <= self.p_val_cutoff) & (np.abs(log_fc_np) >= self.log_fc_cuttoff)
                                     & (log_fc_np <= 0))

        # Plot the points
        fig, ax = plt.subplots(figsize=self.figsize)
        self.add_scatter_and_annotate(ax, x, y, self.colours['sig_small-pos-logFC'], sig_small_pos_logfc)
        self.add_scatter_and_annotate(ax, x, y, self.colours['sig_big-pos-logFC'], 
                                      sig_big_pos_logfc, annotate=True, s=40)

        # Negative
        self.add_scatter_and_annotate(ax, x, y, self.colours['sig_small-neg-logFC'], sig_small_neg_logfc)
        self.add_scatter_and_annotate(ax, x, y, self.colours['sig_big-neg-logFC'], sig_big_neg_logfc, annotate=True, s=40)
        self.add_labels()
        ax.tick_params(labelsize=6)
        ax.tick_params(direction='out', length=2, width=0.5)
        ax.spines['bottom'].set_linewidth(0.5)
        ax.spines['top'].set_linewidth(0)
        ax.spines['left'].set_linewidth(0.5)
        ax.spines['right'].set_linewidth(0)
        ax.tick_params(labelsize=6)
        ax.tick_params(axis='x', which='major', pad=0)
        ax.tick_params(axis='y', which='major', pad=0)
        return ax
In [34]:
for f in os.listdir(fig_dir):
    if 'GSEA' in f and 'KEGG' in f and 'padj' in f:
        
        gsea_out = pd.read_csv(fig_dir + f)
        gsea_out = gsea_out.fillna(1.0)
        if len(gsea_out) > 1:
            volcanoplot = Volcanoplot(gsea_out, 'NES', 'padj', 'pathway', 
                                          f.split("_")[2], 'NES', '-log10(p adj)', 
                                          p_val_cutoff=1.0, max_labels=5,
                                          label_big_sig=True, log_fc_cuttoff=1.5, figsize=(1.5,1.5))
            sns.set_style("ticks")
            volcanoplot.plot()
            save_fig(f'Volcano_gsea_{f.split("_")[2]}')
            plt.show()
        else:
            print("Nothing significant for: ", f)
        
Nothing significant for:  GSEA_log2FoldChange_p11_padj_p11Rank_KEGG_.csv
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	6.1e-09	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	5.676274111573837e-07	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	0.4623655913978495	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	7.999738149709953e-06	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	6.0333333333333344e-09	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	0.09497298107778476	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	7.7489952815105e-06	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	0.009588451068305224	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	4.921090650064962e-06	
--------------------------------------------------------------------------------
Nothing significant for:  GSEA_log2FoldChange_a11_padj_a11Rank_KEGG_.csv
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	1.49e-08	
--------------------------------------------------------------------------------
In [24]:
for f in os.listdir(fig_dir):
    if 'GSEA' in f and 'Go' in f and 'svg' not in f:
        gsea_out = pd.read_csv(fig_dir + f)
        gsea_out = gsea_out.fillna(1.0)
        volcanoplot = Volcanoplot(gsea_out, 'NES', 'padj', 'pathway', 
                                      f.split("_")[2], 'NES', '-log10(p adj)', 
                                      p_val_cutoff=1.0, max_labels=5,
                                      label_big_sig=True, log_fc_cuttoff=1.5, figsize=(2,2))
        sns.set_style("ticks")
        volcanoplot.plot()
        save_fig(f'Volcano_gsea_GO_{f.split("_")[1]}_{f.split("_")[2]}')
        plt.show()
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	2.3654166666666663e-08	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	2.3704166666666663e-08	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	1.7254545454545453e-08	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	4.3907692307692315e-08	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	6.800000000000001e-09	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	7.046913580246914e-09	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	5.1890909090909094e-08	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	2.3479166666666663e-08	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	5.795638607604625e-08	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	8.354352765535156e-08	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	9.425e-09	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	1.4957894736842104e-08	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	1.0137499999999999e-08	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	7.049814265637555e-07	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	5.7692259773570684e-09	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	9.503333333333334e-08	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	5.4764150943396234e-09	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	1.726363636363636e-08	
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
No offset was provided, setting offset to be smallest value recorded in dataset: 	8.134285714285714e-09	
--------------------------------------------------------------------------------